Question about using one _real_extract to fulfill data for another situation #24502

Closed
mshiels opened this issue Mar 27, 2020 · 3 comments

@mshiels mshiels commented Mar 27, 2020

Checklist

  • I'm asking a question
  • I've looked through the README and FAQ for similar questions
  • I've searched the bugtracker for similar questions including closed ones

Question

I have been working on a number of 'whole show' level changes to various extractors and have been running them on a much older version of youtube_dl. Generally each one was a copy of the normal extractor wrapped inside some loops over video_ids etc.

Now I have updated to the latest version and am trying to re-integrate my code to possibly submit it for inclusion. I was wondering what I am missing: when I try to simplify and literally have my, say, NBCShowIE call NBCIE to do the information extraction, I run into a problem I have not seen before and get the following error:

ERROR: 'NoneType' object has no attribute 'get'

I am sure I am missing something stupid and silly, but I am stumped right now. My normal code literally accepts the show URL and then extracts, via scraping, any identifiers it can find. In the new style I was hoping to just pass each one to the normal processing code, which already has the extraction logic. This would ensure my code doesn't duplicate existing stuff, which would make it much easier to maintain.
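A minimal sketch of that delegation idea, written in plain Python so it runs without youtube-dl installed. It assumes the show page yields relative episode paths like the ones in the trace further down. The function name `build_show_playlist`, the default `ie_key` value and the dict layout are illustrative assumptions; the layout imitates what youtube-dl's `url_result` and `playlist_result` helpers are understood to return, which should be checked against the installed version.

```python
from urllib.parse import urljoin


def build_show_playlist(show_url, episode_paths, ie_key='NBC'):
    """Turn relative episode paths into deferred 'url' entries.

    Hypothetical helper: each entry only names the episode URL and the
    extractor that should handle it; no episode extraction happens here.
    """
    entries = []
    seen = set()
    for path in episode_paths:
        episode_url = urljoin(show_url, path)
        if episode_url in seen:  # skip duplicate links scraped from the page
            continue
        seen.add(episode_url)
        entries.append({'_type': 'url', 'url': episode_url, 'ie_key': ie_key})
    return {'_type': 'playlist', 'entries': entries}


playlist = build_show_playlist(
    'https://www.nbc.com/the-voice/episodes',
    ['/the-voice/video/the-battles-premiere/4135310',
     '/the-voice/video/the-blind-auditions-part-5/4131245'],
)
print(playlist['entries'][0]['url'])
```

The point of the design is that the show-level code only collects URLs; the existing episode extractor stays the single place that knows how to extract an episode.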

But I just need some sort of concept clarification or smack on the head - something basic is wrong here and I swear I ran into it when I first did these, but that was back in 2018.

And here is a simple trace from say NBC The Voice

https://www.nbc.com/the-voice/episodes
ParseResult(scheme='https', netloc='www.nbc.com', path='/the-voice/episodes', params='', query='', fragment='')
[NBCShow] with/episodes: Downloading webpage
['/the-voice/video/the-battles-premiere/4135310', '/the-voice/video/the-blind-auditions-part-5/4131245', '/the-voice/video/the-blind-auditions-part-4/4127790',
'/the-voice/video/the-blind-auditions-part-3/4123730', '/the-voice/video/the-blind-auditions-part-2/4119821', '/the-voice/video/the-blind-auditions-season-premiere/4119820']
/the-voice/video/the-battles-premiere/4135310
https://www.nbc.com/the-voice/video/the-battles-premiere/4135310
/the-voice/video/the-blind-auditions-part-5/4131245
https://www.nbc.com/the-voice/video/the-blind-auditions-part-5/4131245
/the-voice/video/the-blind-auditions-part-4/4127790
https://www.nbc.com/the-voice/video/the-blind-auditions-part-4/4127790
/the-voice/video/the-blind-auditions-part-3/4123730
https://www.nbc.com/the-voice/video/the-blind-auditions-part-3/4123730
/the-voice/video/the-blind-auditions-part-2/4119821
https://www.nbc.com/the-voice/video/the-blind-auditions-part-2/4119821
/the-voice/video/the-blind-auditions-season-premiere/4119820
https://www.nbc.com/the-voice/video/the-blind-auditions-season-premiere/4119820
[download] Downloading playlist: None
[NBCShow] playlist None: Collected 6 video ids (downloading 6 of them)
[download] Downloading video 1 of 6
ERROR: 'NoneType' object has no attribute 'get'

@mshiels mshiels added the question label Mar 27, 2020

@mshiels mshiels commented Mar 28, 2020

Well, duh, I forgot to inherit from the proper class, so I got further.

If I just run one of my 'urls' through they are fine for each episode of the show, but trying to pass them from NBCShowIE to NBCIE using this

super(NBCShowIE,self)._real_extract(urlbits.scheme + '://' + urlbits.netloc + video_id)

ends up not passing the regex?

Hmm. It's a problem with class inheritance, of course. So for now I changed _VALID_URL to also keep a local copy in, say, _classname_VALID_URL, which the class uses internally. That way you can call one class from another and it still uses its local URL instead of the overwritten value in the child class.
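The inheritance effect described above can be reproduced in plain Python, without youtube-dl. In this sketch `EpisodeIE`, `ShowIE`, `_real_extract_pinned` and the example.com URL patterns are all hypothetical stand-ins, simplified from the real extractors; only the attribute name `_VALID_URL` and the method name `_real_extract` come from the thread.

```python
import re


class EpisodeIE:
    # Hypothetical stand-in for an episode extractor such as NBCIE.
    _VALID_URL = r'https?://example\.com/(?P<show>[^/]+)/video/(?P<id>\d+)'

    def _real_extract(self, url):
        # self._VALID_URL is looked up on the instance's class, so a
        # subclass that redefines _VALID_URL changes this lookup.
        match = re.match(self._VALID_URL, url)
        return match.group('id') if match else None

    def _real_extract_pinned(self, url):
        # Naming the class explicitly pins the pattern to EpisodeIE.
        match = re.match(EpisodeIE._VALID_URL, url)
        return match.group('id') if match else None


class ShowIE(EpisodeIE):
    # Hypothetical show-level extractor that overrides the pattern.
    _VALID_URL = r'https?://example\.com/(?P<id>[^/]+)/episodes'


show = ShowIE()
episode_url = 'https://example.com/the-voice/video/4135310'
print(show._real_extract(episode_url))         # None
print(show._real_extract_pinned(episode_url))  # 4135310
```

Keeping a separately named copy of the pattern, as described above, has the same effect as the pinned lookup here: the parent's method no longer depends on what the child class assigns to `_VALID_URL`.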

So that worked well, but now I think I have realized why I copied more code than I wanted. Once I have appended all the episodes, some episodes may trip an ExtractorError since they are APO needed, while other episodes do not, but it seems self.playlist_result won't continue even with the '-i' option. Is that normal/right?
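One plausible explanation, stated here as an assumption rather than confirmed youtube-dl behavior: if the show extractor runs every episode extraction itself, an error in one episode propagates out of the show extraction as a whole, before any per-video error handling gets a chance to apply. The plain-Python sketch below illustrates that difference; `EpisodeError`, `extract_episode` and `extract_show` are hypothetical names, not youtube-dl APIs.

```python
class EpisodeError(Exception):
    """Hypothetical stand-in for an extractor error on a single episode."""


def extract_episode(url):
    # Simulated episode extraction: URLs containing 'locked' fail.
    if 'locked' in url:
        raise EpisodeError('cannot extract %s' % url)
    return {'id': url.rsplit('/', 1)[-1], 'url': url}


def extract_show(urls, ignore_errors=False):
    """Extract every episode, optionally skipping the ones that fail."""
    entries, failed = [], []
    for url in urls:
        try:
            entries.append(extract_episode(url))
        except EpisodeError:
            if not ignore_errors:
                raise  # one bad episode aborts the whole show
            failed.append(url)
    return entries, failed


urls = ['https://example.com/show/1',
        'https://example.com/show/locked/2',
        'https://example.com/show/3']
entries, failed = extract_show(urls, ignore_errors=True)
print([e['id'] for e in entries], failed)
```

With `ignore_errors=False` the second URL raises and nothing is returned, which mirrors the symptom described; guarding each episode individually lets the remaining episodes go through.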

Sorry for the saga here, but if I can get some of this stuff integrated it should be useful, I hope. I have been using a lot of it for a few years now, extracting daily the shows from most major US and CBC (here in Canada) channels.

@mshiels mshiels commented Mar 28, 2020

So NBC is now working. The code will need a bit of cleaning etc., but it might be the first donation. It will accept a show URL and supports 2 or 3 formats; depending on the age of the show, some might still be working. Basically we try the stuff that worked first; if it fails, we keep going till we find some video_id signatures the code likes. Then kaboom: quick download of a whole show, season etc.

@dstftw dstftw closed this Mar 28, 2020

@mshiels mshiels commented Mar 28, 2020

Is there somewhere people working on this product can talk/communicate about things other than incidents?
