AttributeError: 'HTMLParser' object has no attribute 'unescape' #778

Open
rm00-git opened this issue Oct 8, 2020 · 29 comments

Comments

@rm00-git

rm00-git commented Oct 8, 2020

🚨Please review the Troubleshooting section
before reporting any issue. Don't forget to check also the current issues to
avoid duplicates.

AttributeError

Receiving the following error:

AttributeError: 'HTMLParser' object has no attribute 'unescape'

Your environment

  • Operating System (name/version): Windows V.2004
  • Python version: 3.9
  • coursera-dl version: 0.11.5

Steps to reproduce

Method:

coursera-dl regression-models

  • Is the problem happening with the latest version of the script?
  • Do you have all the recommended versions of the modules? See them in the
    file requirements.txt.
  • What is the course that you are trying to access?
  • What is the precise command line that you are using (don't forget to obfuscate
    your username and password, but leave all other information untouched).
  • What are the precise messages that you get? Please, use the --debug
    option before posting the messages as a bug report. Please, copy and paste
    them. Don't reword/paraphrase the messages.

Expected behaviour

Tell us what should happen.

Actual behaviour

C:\Users\ryan1\Documents>coursera-dl regression-models
coursera_dl version 0.11.5
Downloading class: regression-models (1 / 1)
Parsing syllabus of on-demand course . This may take some time, please be patient ...
Processing module week-1-least-squares-and-linear-regression
Processing section introduction
Processing lecture welcome-to-regression-models (supplement)
Traceback (most recent call last):
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\ryan1\AppData\Local\Programs\Python\Python39\Scripts\coursera-dl.exe\__main__.py", line 7, in <module>
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 247, in main
    error_occurred, completed = download_class(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 214, in download_class
    return download_on_demand_class(session, args, class_name)
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 134, in download_on_demand_class
    error_occurred, modules = extractor.get_modules(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\extractors.py", line 53, in get_modules
    error_occurred, modules = self._parse_on_demand_syllabus(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\extractors.py", line 161, in _parse_on_demand_syllabus
    links = course.extract_links_from_supplement(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1268, in extract_links_from_supplement
    supplement_content, self._extract_links_from_text(value))
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1518, in _extract_links_from_text
    supplement_links = self._extract_links_from_a_tags_in_text(text)
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1597, in _extract_links_from_a_tags_in_text
    extension = clean_filename(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\utils.py", line 118, in clean_filename
    s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

@gustavoconter

Hi! I was getting exactly the same error. After some research I found that HTMLParser.unescape was removed in Python 3.9.0, so I switched back to Python 3.8 and it works perfectly. Switching to an older version that is newer than 3.4 should work. Hope it helps!
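
For readers who would rather not downgrade, here is a minimal sketch of the standard-library replacement. The module-level function html.unescape has been available since Python 3.4 and does the job of the removed HTMLParser.unescape method. The sample string is made up for illustration.

```python
import html

# html.unescape is a module-level function (Python 3.4+). It decodes
# HTML character references, which is what the removed
# HTMLParser.unescape method used to do.
raw = "Week 1 &amp; 2 &lt;intro&gt;"  # hypothetical input
decoded = html.unescape(raw)
print(decoded)
```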

@rm00-git
Author

rm00-git commented Oct 9, 2020

Hi! I was getting exactly the same error. After some research I found that HTMLParser.unescape was removed in Python 3.9.0, so I switched back to Python 3.8 and it works perfectly. Switching to an older version that is newer than 3.4 should work. Hope it helps!

Thanks, that worked!

@zenny

zenny commented Oct 30, 2020

Here is a patch that avoids downgrading to Python 3.8: coursera-dl/edx-dl@5490a99. It works on Linux.

@VigneshRamanathan101

Here is a patch that avoids downgrading to Python 3.8: coursera-dl/edx-dl@5490a99. It works on Linux.

I tried copying the file into the coursera folder, but it did not help. I got this error: ImportError: cannot import name 'random_string' from 'coursera.utils'

@manzoorHusain

@rm00-git How do I go back to a previous version of Python? Please tell me, it is much needed.

@manzoorHusain

@rm00-git Thank you so much, man. I really appreciate it.

1c7 referenced this issue in coursera-dl/edx-dl Dec 21, 2020
@ziko442

ziko442 commented Dec 22, 2020

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.

  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

@rohitbalage

Hi! I was getting exactly the same error. After some research I found that HTMLParser.unescape was removed in Python 3.9.0, so I switched back to Python 3.8 and it works perfectly. Switching to an older version that is newer than 3.4 should work. Hope it helps!

Thanks, it worked!

@Nirbhay-Thacker

Nirbhay-Thacker commented Jan 11, 2021

Here is a patch that avoids downgrading to Python 3.8: coursera-dl/edx-dl@5490a99. It works on Linux.

Refer to ziko442's reply to fix the issue with coursera-dl; zenny's patch is for edx-dl.

@rwilcox3

If you modify coursera\utils.py to import html and then replace
h = html_parser.HTMLParser()
with
h = html

it works on Python 3.9. I suggest the dev team implement this change.
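
A minimal sketch of that edit, assuming the only thing the code needs from h is an unescape attribute. The function unescape_name is a hypothetical stand-in for illustration, not the actual clean_filename from coursera-dl.

```python
import html

# Before (fails on Python 3.9+):
#     h = html_parser.HTMLParser()
#     s = h.unescape(s)
# After: bind h to the html module, so h.unescape resolves to html.unescape.
h = html


def unescape_name(s):
    """Hypothetical stand-in for the unescape step inside clean_filename."""
    return h.unescape(s)


print(unescape_name("Q&amp;A notes.pdf"))
```

The change works because existing calls of the form h.unescape(s) stay untouched; only the object bound to h changes.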

@eliottness

If you modify coursera\utils.py to import html and then replace
h = html_parser.HTMLParser()
with
h = html

it works on Python 3.9. I suggest the dev team implement this change.

It worked for me on Debian buster with the bullseye testing repositories.

@michael12987

michael12987 commented Feb 12, 2021

If you modify coursera\utils.py to import html and then replace
h = html_parser.HTMLParser()
with
h = html

it works on Python 3.9. I suggest the dev team implement this change.

Hi, I did that and I am still getting "AttributeError: 'HTMLParser' object has no attribute 'unescape'". Any idea what else I can do to solve it?
Thanks!

@lifepillar

@adirb1 Try also commenting out this line:

from six.moves import html_parser

And you need to replace two occurrences of h = html_parser.HTMLParser() with h = html.
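
To confirm that both occurrences were replaced, a rough check like the following may help. It is only a sketch: find_leftovers is a hypothetical helper, and the sample text stands in for the contents of your own utils.py.

```python
import re

# Pattern for the old call that both replacements should have removed.
OLD_CALL = re.compile(r"html_parser\.HTMLParser\(\)")


def find_leftovers(source):
    """Return 1-based line numbers that still contain the old call,
    ignoring anything after a '#' on each line."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.split("#", 1)[0]  # crude: drop trailing comments
        if OLD_CALL.search(code):
            hits.append(number)
    return hits


# Hypothetical file contents with one occurrence left unpatched.
sample = "import html\nh = html\n    h = html_parser.HTMLParser()\n"
print(find_leftovers(sample))
```

To check an installed copy, pass the text of your own utils.py instead of the sample string.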

@ismail709

#789 (comment)

@Edw590

Edw590 commented Mar 20, 2021

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

I think there may be a problem with that change. html.unescape() is only available from Python 3.4. Below that, it doesn't exist. So I think your code may not work on Python 3.0 through 3.3.

You might want to change the line:

if sys.version_info[0] >= 3:

to

if sys.version_info[0] >= 3 and sys.version_info[1] >= 4:

I'm not using Python 3.3 (I'm on 3.9), but I only found your reply after reading that Python 3.4 is the minimum, so I thought you might want to correct the file. Thanks, though. I'll use your file instead of hard-coding the new way as I was doing.
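
As a side note, the same guard can be sketched with a tuple comparison. This is an illustration under my own assumptions, not the code from the gist. Comparing sys.version_info against (3, 4) checks major and minor together, whereas testing the minor number on its own would reject a hypothetical 4.0.

```python
import sys

# Tuple comparison handles major and minor together, so any version
# from 3.4 upward (including a hypothetical 4.0) takes the first branch.
if sys.version_info >= (3, 4):
    from html import unescape
else:
    # Older interpreters: fall back to the parser method.
    try:
        from html.parser import HTMLParser  # Python 3.0 to 3.3
    except ImportError:
        from HTMLParser import HTMLParser  # Python 2
    unescape = HTMLParser().unescape

print(unescape("a &lt; b"))
```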

@idrissathiam01

The software works for me now using the CAUTH flag and ziko's instructions. Thank you so much.

OS Name: Microsoft Windows 10 Enterprise
Version: 10.0.19043 Build 19043
System Type: x64-based PC

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

@heino

heino commented Apr 3, 2021

Pull request #789 should fix this issue...

@ruslaniv

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Worked in OSX Catalina as well.

@v1a0

v1a0 commented May 8, 2021

It works for me:

apt install python3.9-dev

MAKE SURE YOU INSTALL IT FOR THE RIGHT PYTHON VERSION (python3.x-dev)!

If you use Python 3.x, install python3.x-dev, and so on.

@Vaishnavi-A27

@rm00-git How do I go back to a previous version of Python? Please tell me, it is much needed.

sudo update-alternatives --config python3
You should get a table listing the installed versions of Python. Select the entry for your older version.

@TeymurovFuad

First, thanks @zenny.

1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
   Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.

2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Thanks

@heino

heino commented Dec 28, 2021

This issue was fixed by pull request #789 (as mentioned above).

As such, there is no reason to risk security hazards by resorting to replacing large amounts of code...

@Ali619

Ali619 commented Jan 17, 2022

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

If anybody is using the coursera package and can't fix the utils.py file by hand, just do this and it will be fixed.

@khatiwada1

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This is great. This works for Python 3.10.2 as well.

@Tcoton

Tcoton commented Feb 20, 2022

The software works for me now using the CAUTH flag and ziko's instructions. Thank you so much.

OS Name: Microsoft Windows 10 Enterprise Version: 10.0.19043 Build 19043 System Type: x64-based PC

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This worked on my Mac (macOS 11.6.3) with Python 3.9 as of today, except that the path of utils.py is:

/Users/<your_username>/opt/anaconda3/lib/python3.9/site-packages/coursera/utils.py

I modified the file as described in the link above and could finish downloading the few courses that were throwing the HTML parser error. I also had to get the CAUTH value using Safari's development tools (Web Inspector). It is not great to have to crawl the internet to get it working, but it is well worth the time saved by downloading all the courses. The only things that are not downloaded at all are the readings contained in an iframe.

@shwhsx

shwhsx commented May 14, 2022

I replaced utils.py as described in #778, but it is still not working. There is an error saying:

"XXXX/python3.9/site-packages/coursera/utils.py", line 118, in clean_filename
s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

Not sure what went wrong. I already set h = html.
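
One possible cause, offered as a guess, is that the edited file is not the copy Python actually imports, for example when several Python installations or virtual environments exist. A small sketch to print the file in use follows. The helper name module_file is hypothetical, and looking up a submodule imports its parent package as a side effect.

```python
import importlib.util


def module_file(name):
    """Return the file path Python would load for `name`, or None."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


# With coursera-dl installed, this prints the utils.py that is really used.
# It prints None if the package is not importable from this interpreter.
print(module_file("coursera.utils"))
```

Run it with the same interpreter that runs coursera-dl, then compare the printed path with the file you edited.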

@faea726

faea726 commented Jun 25, 2022

Fixed. Thanks a lot!

If you modify coursera\utils.py to import html and then replace h = html_parser.HTMLParser() with h = html, it works on Python 3.9. I suggest the dev team implement this change.

I edited utils.py:

  1. Add import html
  2. Comment out from six.moves import html_parser
  3. Replace h = html_parser.HTMLParser() with h = html (in 2 places)

Then download with the command coursera-dl -ca <some_cookies_value_get_from_browser> <course_name>

@magombe

magombe commented Jul 13, 2022

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Thanks, this worked for me.

@bethel-m

First, thanks @zenny.

  1. Go to C:\Users\{your_user_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
    Note: if you installed coursera-dl in a venv, go to your-venv-name\Lib\site-packages\coursera instead.
  2. Open the utils.py file and replace all the code with the code from this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This helped me, thanks.
I am using Ubuntu 23.
