Can't download any courses: 'NoneType' object has no attribute 'group' #92

Closed
crypdick opened this issue Feb 27, 2014 · 13 comments

@crypdick

I am pretty sure this is not a duplicate of issue #53 but apologize if it is.

I have started to get the error AttributeError: 'NoneType' object has no attribute 'group' on an edX course which, up to now, has not given me errors (BIO465x Neuronal Dynamics). I have the first 46 videos (into week 7).

When I run the script now this is what I get:

10 - Download them all
Enter Your Choice: 10
Processing 'https://courses.edx.org/courses/EPFLx/BIO465x/2013_OND/courseware/25fe09ce3c7d41a287f0733f0f97132a/8c8a46ab4055477687b496cd7739eecc/'...
Traceback (most recent call last):
  File "./edx-dl.py", line 403, in <module>
    main()
  File "./edx-dl.py", line 323, in main
    for id, container in zip(video_id[-len(id_container):], id_container)]
AttributeError: 'NoneType' object has no attribute 'group'

Just for reference, issue #53 has this error instead:

  File "./edx-dl.py", line 373, in main
    subs_filename = (match.group(1) or match.group(2)).decode('utf-8')[:-4]
AttributeError: 'NoneType' object has no attribute 'group'
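
For readers new to this message: it generally means a regular-expression call such as `re.search` returned `None` (no match) and `.group()` was then called on that result. Below is a minimal Python 3 sketch of the mechanism. The pattern and the HTML snippets are made up for illustration and are not the actual code at line 323 of edx-dl.py.

```python
import re

# Hypothetical pattern and inputs, for illustration only.
PATTERN = re.compile(r'data-video-id="([^"]+)"')

old_markup = '<div data-video-id="abc123"></div>'
new_markup = '<div data-streams="1.0:abc123"></div>'  # attribute renamed

def extract_id(markup):
    # re.search returns None when nothing matches, so .group(1)
    # raises AttributeError on markup the pattern does not expect.
    return PATTERN.search(markup).group(1)

def try_extract(markup):
    # Report the failure as text instead of letting it propagate.
    try:
        return extract_id(markup)
    except AttributeError as exc:
        return "AttributeError: %s" % exc

print(try_extract(old_markup))
print(try_extract(new_markup))
```

If the page markup changes so that the pattern no longer matches, every page fails the same way, which would be consistent with the error appearing on all courses at once.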

@crypdick
Author

OK, so I did a few tests. I tried selecting specific weeks (no change), and it looks like I am getting this error on all my courses. I downloaded the latest version of edx-dl and tried again; I am getting the same error (but on a different line):

  File "./edx-dl.py", line 414, in <module>
    main()
  File "./edx-dl.py", line 334, in main
    for id, container in zip(video_id[-len(id_container):], id_container)]
AttributeError: 'NoneType' object has no attribute 'group'

I found an upstream bug here ytdl-org/youtube-dl#2444 (comment)

However, I reinstalled youtube-dl and beautifulsoup4 with pip, but that didn't fix the problem.

@hrzhu

hrzhu commented Feb 27, 2014

I have the same problem, and I doubt it's youtube-dl (updating to the development version of youtube-dl doesn't solve it). It started very recently. My course is Louv1.01x Paradigms of Computer Programming. I could download successfully just a few days ago. Maybe edX has changed something?

@crypdick
Author

Whoops! Accidentally closed the issue. @hrzhu I just noticed that that upstream bug is not quite the same error. I think you're right about edX changing.

@peterlazar1993

Traceback (most recent call last):
  File "edx-dl.py", line 414, in <module>
    main()
  File "edx-dl.py", line 334, in main
    for id, container in zip(video_id[-len(id_container):], id_container)]
AttributeError: 'NoneType' object has no attribute 'group'

I get the same error too. Is there any fix yet?

@ghost

ghost commented Feb 27, 2014

Hello, I think I have the same issue:

Processing 'https://courses.edx.org/courses/UTAustinX/UT.9.01x/1T2014/courseware/ ect /'...
Traceback (most recent call last):
  File "/Users/Julien/Downloads/edx-downloader-master-3/edx-dl.py", line 414, in <module>
    main()
  File "/Users/Julien/Downloads/edx-downloader-master-3/edx-dl.py", line 334, in main
    for id, container in zip(video_id[-len(id_container):], id_container)]
AttributeError: 'NoneType' object has no attribute 'group'

@shk3 shk3 added the bug label Feb 28, 2014
@shk3
Member

shk3 commented Mar 1, 2014

Yeah, I am aware of this issue, but I am sorry that I can't fix it soon, since I have been overwhelmed recently. Can anyone help fix it?

slitvinov added a commit to slitvinov/edx-downloader that referenced this issue Mar 2, 2014
slitvinov added a commit to slitvinov/edx-downloader that referenced this issue Mar 2, 2014
@slitvinov

It seems this regexp is broken now
https://github.com/shk3/edx-downloader/blob/master/edx-dl.py#L323

With this workaround
slitvinov/edx-downloader@shk3:master...master
one can download videos.
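
As a general illustration of tolerating a pattern that no longer matches (this is not slitvinov's actual change, which is only available at the link above), one option is to check each match for `None` and skip the containers that fail, rather than calling `.group()` inside a list comprehension. The function name, pattern, and sample strings below are hypothetical.

```python
import re

def extract_ids(containers, pattern):
    """Return (ids, skipped); skipped lists containers with no match."""
    ids, skipped = [], []
    for container in containers:
        match = pattern.search(container)
        if match is None:
            # Record the failure instead of crashing the whole run.
            skipped.append(container)
            continue
        ids.append(match.group(1))
    return ids, skipped

# Hypothetical pattern and inputs, for illustration only.
pattern = re.compile(r'data-video-id="([^"]+)"')
containers = ['<div data-video-id="a1"></div>', '<div class="other"></div>']

ids, skipped = extract_ids(containers, pattern)
print(ids)
print(len(skipped))
```

Skipping silently can hide a broken pattern, so reporting the skipped containers to the user is probably worthwhile.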

@Xiaohong-Deng

With @slitvinov's modification, subtitle extraction doesn't work, so the entire download process is interrupted.

@slitvinov

On Mar 2, 2014 11:52 PM, "Xiaohong-Deng" wrote:

> hence the entire downloading process will be interrupted.

What error message do you get?

@Xiaohong-Deng

@slitvinov
Traceback (most recent call last):
  File "edx-dl.py", line 418, in <module>
    main()
  File "edx-dl.py", line 396, in main
    subs_string = edx_get_subtitle(s, headers)
  File "edx-dl.py", line 159, in edx_get_subtitle
    jsonString = get_page_contents(url, headers)
  File "edx-dl.py", line 120, in get_page_contents
    result = urlopen(Request(url, None, headers))
  File "C:\Coding\Python27564\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Coding\Python27564\lib\urllib2.py", line 396, in open
    protocol = req.get_type()
  File "C:\Coding\Python27564\lib\urllib2.py", line 258, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type:
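
The message ends with nothing after "unknown url type:", which suggests the subtitle URL handed to `urlopen` was an empty string. A minimal Python 3 sketch of checking a URL before requesting it follows. The helper name `is_fetchable` is hypothetical and not part of edx-dl.py, and the traceback above comes from Python 2's `urllib2`, whereas this sketch uses Python 3's `urllib.parse`.

```python
from urllib.parse import urlparse

def is_fetchable(url):
    """Return True only for absolute http(s) URLs with a host."""
    parts = urlparse(url or "")  # treat None like an empty string
    return parts.scheme in ("http", "https") and bool(parts.netloc)

print(is_fetchable(""))
print(is_fetchable("https://courses.edx.org/"))
```

A caller could skip the subtitle for a video when this returns False instead of letting the request raise.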

@crypdick
Author

crypdick commented Mar 3, 2014

@slitvinov Thank you for the hack! Most of my courses are downloading now :)

@iemejia
Member

iemejia commented Mar 3, 2014

After some research I found that this bug should actually be split into two bugs. The first is the obvious one: edX changed the API that exposes the subtitles, which now also supports multiple languages.

I have a commit that changes little and fixes the subtitles part, but for now it hardcodes the language to English (most of the videos don't have translations yet, so I don't think this will be an issue for the moment).
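
To illustrate the trade-off of hardcoding English, here is a small Python 3 sketch of choosing a subtitle language with English as the default. The function name and the idea of receiving a list of available language codes are assumptions for illustration. The thread does not show what the changed edX API returns, and this is not the code in the commit mentioned above.

```python
def pick_language(available, preferred="en"):
    """Choose a subtitle language from those a video offers."""
    if not available:
        return None  # the video has no subtitles at all
    if preferred in available:
        return preferred
    # Arbitrary but deterministic fallback: first code alphabetically.
    return sorted(available)[0]

print(pick_language(["en", "fr"]))
print(pick_language(["fr", "de"]))
```

Hardcoding English corresponds to always returning `"en"`, which works as long as every video with subtitles offers an English track.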

The second bug is that the error appears even if I don't want to download subtitles, which doesn't make any sense. Fixing this one requires some extra refactoring.
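
One way to picture the second bug's fix is to keep the subtitle code path entirely behind the user's choice, so a subtitle failure cannot interrupt a video-only download. The Python 3 sketch below rests on assumptions: the function name, the `subtitle_url` key, and the `fetch` callable are hypothetical and do not reflect how edx-dl.py is structured.

```python
def maybe_fetch_subtitle(video, want_subtitles, fetch):
    """Fetch a subtitle only when requested and a URL is present."""
    if not want_subtitles:
        return None  # never touch the subtitle code path
    url = video.get("subtitle_url")
    if not url:
        return None  # nothing to fetch; the video can still download
    return fetch(url)

# Stand-in for a real network call, used only to show the control flow.
calls = []

def fake_fetch(url):
    calls.append(url)
    return "subtitle text"

video = {"subtitle_url": "https://example.org/subs"}
print(maybe_fetch_subtitle(video, False, fake_fetch))
print(maybe_fetch_subtitle(video, True, fake_fetch))
```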

@slitvinov

@Xiaohong-Deng Thanks. It seems you are trying to download subtitles. My hack cannot handle it. I am sorry for not stating it clearly. I think a fix from @iemejia should be more helpful.

@iemejia iemejia closed this as completed in 0105db7 Mar 5, 2014
rbrito pushed a commit that referenced this issue Mar 5, 2014
Fix #92 Updated subtitle URL string after edx changes.
@AmrFathi

Same problem here!

Processing 'https://courses.edx.org/courses/BerkeleyX/CS_CS169.1x/1T2014/courseware/aec573b66478440986fa3d07074b3b91/a4de3cf2fef548169ff06484abd80e3b/'...
Processing 'https://courses.edx.org/courses/BerkeleyX/CS_CS169.1x/1T2014/courseware/75fafdf4d7204f9fa92e86c351cf8635/ad1c28f913254a76a68351062bffe37f/'...
Traceback (most recent call last):
  File "edx-dl.py", line 445, in <module>
    main()
  File "edx-dl.py", line 366, in main
    for id, container in zip(video_id[-len(id_container):], id_container)]
IndexError: no such group
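
Note that this is a different exception from the one in the issue title. `IndexError: no such group` is raised when a match exists but the requested group number is higher than the number of capturing groups in the pattern. A minimal Python 3 sketch, with a hypothetical pattern and helper name that are not taken from edx-dl.py:

```python
import re

def safe_group(match, index):
    """Return group `index`, or None if there is no match or no such group."""
    if match is None or index > match.re.groups:
        return None
    return match.group(index)

# Hypothetical pattern with a single capturing group.
match = re.search(r"id=(\w+)", "id=abc123")

print(safe_group(match, 1))
print(safe_group(match, 2))  # match.group(2) itself would raise IndexError
```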

This issue was closed.

8 participants