AttributeError: 'HTMLParser' object has no attribute 'unescape' #778

rm00-git · 2020-10-08T20:13:09Z


AttributeError

Receiving the following error:

AttributeError: 'HTMLParser' object has no attribute 'unescape'

Your environment

Operating System (name/version): Windows V.2004
Python version: 3.9
coursera-dl version: 0.11.5

Steps to reproduce

Method:

coursera-dl regression-models

Actual behaviour

C:\Users\ryan1\Documents>coursera-dl regression-models
coursera_dl version 0.11.5
Downloading class: regression-models (1 / 1)
Parsing syllabus of on-demand course . This may take some time, please be patient ...
Processing module week-1-least-squares-and-linear-regression
Processing section introduction
Processing lecture welcome-to-regression-models (supplement)
Traceback (most recent call last):
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\ryan1\AppData\Local\Programs\Python\Python39\Scripts\coursera-dl.exe\__main__.py", line 7, in <module>
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 247, in main
    error_occurred, completed = download_class(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 214, in download_class
    return download_on_demand_class(session, args, class_name)
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\coursera_dl.py", line 134, in download_on_demand_class
    error_occurred, modules = extractor.get_modules(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\extractors.py", line 53, in get_modules
    error_occurred, modules = self._parse_on_demand_syllabus(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\extractors.py", line 161, in _parse_on_demand_syllabus
    links = course.extract_links_from_supplement(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1268, in extract_links_from_supplement
    supplement_content, self._extract_links_from_text(value))
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1518, in _extract_links_from_text
    supplement_links = self._extract_links_from_a_tags_in_text(text)
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\api.py", line 1597, in _extract_links_from_a_tags_in_text
    extension = clean_filename(
  File "c:\users\ryan1\appdata\local\programs\python\python39\lib\site-packages\coursera\utils.py", line 118, in clean_filename
    s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'


gustavoconter · 2020-10-09T15:20:20Z

Hi! I was getting the exact same error. After some research I found that HTMLParser.unescape was removed in Python 3.9.0, so I switched back to Python 3.8 and it is working perfectly fine. Switching to an older version that is greater than 3.4 should work. Hope it helps!
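
For context, `HTMLParser.unescape` had been deprecated since Python 3.4 in favour of the module-level `html.unescape`, and 3.9 finally removed it. A minimal sketch of the difference, using only the standard library (the sample string is made up for illustration):

```python
import html
import sys
from html.parser import HTMLParser

sample = "Week 1 &amp; 2: &lt;intro&gt;"  # made-up input, not from coursera-dl

# html.unescape() has been available since Python 3.4.
decoded = html.unescape(sample)
print(decoded)

# On Python 3.9+ a parser instance no longer carries the old method,
# which is exactly what the AttributeError in this issue reports.
has_old_method = hasattr(HTMLParser(), "unescape")
print(sys.version_info[:2], has_old_method)
```

So downgrading works because 3.8 still has the old method, but code that calls `html.unescape` directly should not need a downgrade.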

rm00-git · 2020-10-09T16:06:09Z

> Hi! I was receiving the exact same error, did some research and I discovered that in python 3.9.0 HTMLParser.unescape was removed, so I switched back to python 3.8 and it is working perfectly fine. Switching to a older version that is greater than 3.4 should work. Hope it helps!

Thanks, that worked!

zenny · 2020-10-30T16:32:47Z

Here is a patch that avoids downgrading to Python 3.8: coursera-dl/edx-dl@5490a99. It works on Linux.

VigneshRamanathan101 · 2020-11-20T18:53:13Z

> Here is a patch without needing to downgrade to python3.8 coursera-dl/edx-dl@5490a99 works in linux.

I tried copying the file into the coursera folder, but it didn't help. I got this error: ImportError: cannot import name 'random_string' from 'coursera.utils'

manzoorHusain · 2020-12-18T08:15:35Z

@rm00-git How do I go back to a previous version of Python? Please tell me, it's much needed.

manzoorHusain · 2020-12-18T09:08:51Z

@rm00-git Thank you so much man. I really appreciate it.

considering backward compatibility Python 3.9.0 changelog: https://docs.python.org/release/3.9.0/whatsnew/changelog.html

ziko442 · 2020-12-22T14:40:26Z

First, thanks @zenny.

Go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera
Note: if you installed coursera-dl in a venv, go to ur-v-env-name\Lib\site-packages\coursera instead.
Then open utils.py and replace all of its code with the code in this link: https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

rohitbalage · 2021-01-09T06:37:05Z

> Hi! I was receiving the exact same error, did some research and I discovered that in python 3.9.0 HTMLParser.unescape was removed, so I switched back to python 3.8 and it is working perfectly fine. Switching to a older version that is greater than 3.4 should work. Hope it helps!

Thanks, it worked!

Nirbhay-Thacker · 2021-01-11T09:15:45Z

> Here is a patch without needing to downgrade to python3.8 coursera-dl/edx-dl@5490a99 works in linux.

Refer to ziko442's reply to fix the issue with coursera-dl; the patch quoted here is for edx-dl.

rwilcox3 · 2021-01-30T07:03:48Z

If you modify coursera\utils.py to `import html` and then replace `h = html_parser.HTMLParser()` with `h = html`, it works on 3.9. I suggest the dev team implement this change.

eliottness · 2021-02-05T18:41:13Z

> If you modify coursera\utils.py to import html
> and then replace h = html_parser.HTMLParser()
> with
> h = html
>
> 3.9 Works. Suggest the dev team to implement this change.

It worked for me on debian buster with bullseye testing repositories.

michael12987 · 2021-02-12T09:49:06Z

> If you modify coursera\utils.py to import html
> and then replace h = html_parser.HTMLParser()
> with
> h = html
>
> 3.9 Works. Suggest the dev team to implement this change.

Hi, I did this and I am still getting "AttributeError: 'HTMLParser' object has no attribute 'unescape'". Any idea what else I can do to solve it?
Thanks!

lifepillar · 2021-03-12T16:23:10Z

@adirb1 Try also commenting out this line:

from six.moves import html_parser

And you need to replace two occurrences of h = html_parser.HTMLParser() with h = html.
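
Put together, the edit described in the comments above amounts to roughly the following. This is only a sketch: `unescape_text` is a hypothetical stand-in for the `h.unescape(s)` step, not the real `clean_filename` from coursera\utils.py, which does more than unescape.

```python
import html

# Previously (breaks on Python 3.9+):
#   from six.moves import html_parser
#   h = html_parser.HTMLParser()
# The html module itself exposes unescape(), so binding h to the module
# keeps existing h.unescape(s) call sites working unchanged.
h = html


def unescape_text(s):
    """Hypothetical stand-in for the h.unescape(s) step in clean_filename."""
    return h.unescape(s)


print(unescape_text("Lecture 1 &quot;Intro&quot; &amp; slides"))
```

If the error persists after editing, it is worth checking that every `h = html_parser.HTMLParser()` assignment was replaced and that the edited file is the one your interpreter actually imports.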

ismail709 · 2021-03-16T10:39:19Z

#789 (comment)

Edw590 · 2021-03-20T16:04:55Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

I think there may be a problem with that change. html.unescape() is only available from Python 3.4 onwards; below that, it doesn't exist. So I think your code may not work on Python 3.0 through 3.3.

You might want to change the line:

if sys.version_info[0] >= 3:

to

if sys.version_info[0] >= 3 and sys.version_info[1] >= 4:

I'm not using Python 3.3 (using 3.9), but only found your reply after having read about Python 3.4 as minimum, so thought you might want to correct the file. Thanks though. I'll go for that instead of hard-coding the new way as I was doing.
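
A sketch of how such a guard could look. Comparing `sys.version_info` against a tuple avoids checking major and minor numbers separately (a separate check would, for instance, reject a hypothetical 4.0). The fallback branch is an assumption about how one might wire it up, not the code from the gist:

```python
import sys

if sys.version_info >= (3, 4):
    # html.unescape() exists from Python 3.4 onwards.
    from html import unescape
else:
    # Older interpreters: fall back to the HTMLParser method.
    try:
        from html.parser import HTMLParser  # Python 3.0 to 3.3
    except ImportError:
        from HTMLParser import HTMLParser  # Python 2
    unescape = HTMLParser().unescape

print(unescape("R&amp;D &gt; theory"))
```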

idrissathiam01 · 2021-04-01T19:19:26Z

The software works for me now using the CAUTH flag and ziko's instructions. Thank you so much.

OS Name: Microsoft Windows 10 Enterprise
Version: 10.0.19043 Build 19043
System Type: x64-based PC

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

heino · 2021-04-03T08:29:44Z

Pull request #789 should fix this issue...

ruslaniv · 2021-04-22T09:47:48Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Worked in OSX Catalina as well.

v1a0 · 2021-05-08T12:15:13Z

It works for me:

apt install python3.9-dev

MAKE SURE YOU INSTALL IT FOR RIGHT PYTHON VERSION! (`python3.x-dev`)

If you use python 3.x install `python3.x-dev`, and so on

Vaishnavi-A27 · 2021-12-28T05:47:33Z

> @rm00-git how to go to previous version of python . Please tell me. Much needed.

sudo update-alternatives --config python3
You should get a table listing the different versions of Python. Select the option for your older version of Python.

TeymurovFuad · 2021-12-28T23:10:53Z

> first thanks @zenny
>
> 1. go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
>    note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> 2. => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Thanks

heino · 2021-12-28T23:50:06Z

This issue was fixed by pull request #789 (as mentioned above).

As such, there is no reason to risk security hazards by resorting to replacing large amounts of code...

Ali619 · 2022-01-17T08:51:29Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

If anybody is using the coursera package and can't fix the utils.py file by hand, just do this and it will be fixed.

khatiwada1 · 2022-01-19T16:02:44Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This is great. This works for Python 3.10.2 as well.

Tcoton · 2022-02-20T19:57:52Z

> The software works for me now using the CAUTH flag and ziko's instructions. Thank you so much.
>
> OS Name: Microsoft Windows 10 Enterprise Version: 10.0.19043 Build 19043 System Type: x64-based PC
>
> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This worked on my Mac OS X 11.6.3 with Python 3.9 as of today, except that the path to utils.py is:

/Users/<your_username>/opt/anaconda3/lib/python3.9/site-packages/coursera/utils.py

I modified the file as per the link above and could finish downloading the few courses that were throwing the HTML parser error. I also had to get the CAUTH value using Safari's development tools (Web Inspector). Not very cool to have to crawl the internet to get it working, but well worth it for the time saved downloading all the courses. The only things that are not downloaded at all are the readings contained in an iframe.

shwhsx · 2022-05-14T13:06:45Z

I replaced utils.py as described in #778, but it is still not working. There is an error saying:
"XXXX/python3.9/site-packages/coursera/utils.py", line 118, in clean_filename
s = h.unescape(s)
AttributeError: 'HTMLParser' object has no attribute 'unescape'

Not sure what went wrong. I already set h = html.

faea726 · 2022-06-25T06:57:06Z

Fixed. Thanks a lot!

> If you modify coursera\utils.py to import html and then replace h = html_parser.HTMLParser() with h = html
>
> 3.9 Works. Suggest the dev team to implement this change.

I edited utils.py:
1. Added import html.
2. Commented out from six.moves import html_parser.
3. Replaced h = html_parser.HTMLParser() with h = html (in 2 places).

Then I downloaded with the command coursera-dl -ca <some_cookies_value_get_from_browser> <course_name>

magombe · 2022-07-13T16:37:48Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

Thanks this has worked for me

bethel-m · 2023-06-28T14:31:54Z

> first thanks @zenny
>
> go to C:\Users\{ur_usr_name}\AppData\Local\Programs\Python\Python39\Lib\site-packages\coursera:
> note : if u install coursera-dl in a venve go to: ur-v-env-name\Lib\site-packages\coursera:
>
> => open utils.py file and replace all the code with the code in this link : https://gist.github.com/ziko442/d57d91da980e72414c725eb60878bc2d

This helped me, thanks.
I am using Ubuntu 23.

bilel mentioned this issue Dec 3, 2020

again folders are empty and nothing is downloaded coursera-dl/edx-dl#655

Closed

jackforfaltu mentioned this issue Dec 11, 2020

Only folder structure without videos coursera-dl/edx-dl#649

Open

1c7 referenced this issue in coursera-dl/edx-dl Dec 21, 2020

Update for deprecated HTMLParser.unescape for python >=3.9

5490a99

considering backward compatibility Python 3.9.0 changelog: https://docs.python.org/release/3.9.0/whatsnew/changelog.html

WillNilges mentioned this issue Jan 17, 2021

AttributeError: 'HTMLParser' object has no attribute 'unescape' trackmastersteve/alienfx#87

Open

xiaobojiang mentioned this issue Jan 21, 2021

unescape example @ 2.17 在字符串中处理html和xml not working on Python3.9 yidao620c/python3-cookbook#342

Open

ZXSwire3 mentioned this issue Feb 28, 2021

HTTPError: 400 Client Error: Bad Request for url: https://api.coursera.org/api/login/v3 #702

Closed

zenny mentioned this issue Jun 29, 2021

[SOLVED] Coursera-dl stopped working completely with Error 403 Client Error due to upstream changes! #800

Closed

mfiers mentioned this issue Jul 27, 2021

Installation seems incompatible with python 3.9+ STOmics/Stereopy#11

Closed

plainas mentioned this issue Sep 1, 2021

Remove bound for setuptools. plainas/tq#22

Closed

JoAnFe mentioned this issue Oct 31, 2021

unable to download course #792

Open

allentiak mentioned this issue Jan 2, 2022

Fix AttributeError: 'HTMLParser' object has no attribute 'unescape' #789

Open

Espionage724 mentioned this issue Sep 2, 2022

Getting this to work on Fedora 36 (likely other distros in 2022) #832

Open
