A model, well-commented extractor #11968
That's a great suggestion. However, youtube-dl has no uniform coding convention. Coding styles change quickly over time, so new or recently changed extractors look quite different from earlier ones. (I would rather say my early contributions are like shit :D) To make things worse, coding styles differ between the major developers, as they may be involved in other Python projects. I don't think such a model extractor can really exist.
If you really need well-written extractors, consider those that are written mostly by major developers (dstftw, phihag, jaimeMF, remitamine and me). If you really need concrete examples, consider having a look at litv.py, bilibili.py and niconico.py - those are my recent changes.
About comments: It's also a great idea. I prefer to document functions where they are defined. At least
Model extractors
I'm not terribly worried about uniform coding conventions; several choices would be fine, even if they conflict. Multiple model extractors would be fine. And the point here isn't to use them to critique submissions and fail them because of style; they are intended as an aid to developers, to help them figure out which recommendations might make sense for them, and to pick and choose among them. The particular boring and pedestrian example this morning was finding an example where
Still, it would be helpful if those concrete examples were listed somewhere, maybe
Comments
Comments and documentation strings are different, and they are both useful and complement each other. Indeed, pydoc does assemble the documentation strings into HTML and the results have been useful to me. But that's not the same as showing the narrative for how an extractor should be written and the reasonable choices, and maybe why you should call
Really, the difference between comments in example code and docstrings is the same as the difference between a User's Guide and a Reference Manual. Both are useful, but sometimes reading function definitions in alphabetical order isn't the best way to learn.
A list of available APIs should solve the problem. I guess pydoc or alternatives can generate HTML documents just like docs.python.org or readthedocs.org. Newcomers can read them first.
For non-functionality styles, there's much freedom!
I guess problems will be gone when API documents are (almost) complete.
It may look like:
Documents are enough, aren't they?
I guess some tools can put functions into groups? docs.python.org does that very well. Of course a user guide is also helpful. It can contain the primary stages for writing an extractor. Common functions like _html_search_regex and _search_regex will be mentioned in descriptions.
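The "primary stages" idea can be illustrated without youtube-dl itself. The following is a minimal standalone sketch, assuming an imaginary site (`example.com`) and made-up page markup. `VALID_URL`, `search_regex` and `extract` are hypothetical names written for this illustration; they are only loosely modeled on the idea behind helpers such as `_search_regex` and are not youtube-dl's actual implementation.

```python
import re

# Hypothetical URL pattern for an imaginary site; the named group captures the video ID.
VALID_URL = r'https?://(?:www\.)?example\.com/watch/(?P<id>[0-9]+)'

# Sentinel, so that None can be passed as a legitimate default value.
_NO_DEFAULT = object()


def search_regex(pattern, string, name, default=_NO_DEFAULT):
    """Return group 1 of the first match of pattern in string.

    If nothing matches, return default when one was given; otherwise raise
    ValueError naming the field that could not be found.
    """
    match = re.search(pattern, string)
    if match:
        return match.group(1)
    if default is not _NO_DEFAULT:
        return default
    raise ValueError('Unable to extract %s' % name)


def extract(url, webpage):
    """Build a metadata dict from a URL and the already-downloaded page HTML."""
    # Stage 1: check that the URL belongs to this site and pull out the ID.
    match = re.match(VALID_URL, url)
    if match is None:
        raise ValueError('Unsupported URL: %s' % url)
    video_id = match.group('id')

    # Stage 2: a real extractor would download the page here; this sketch
    # takes the HTML as an argument so that it runs without network access.

    # Stage 3: mandatory fields. No default is given, so a missing field
    # raises an error instead of silently producing a broken result.
    title = search_regex(r'<h1[^>]*>([^<]+)</h1>', webpage, 'title')
    video_url = search_regex(r'<video[^>]+src="([^"]+)"', webpage, 'video URL')

    # Stage 4: optional fields. A default of None means "fine if absent".
    description = search_regex(
        r'<p class="description">([^<]+)</p>', webpage, 'description',
        default=None)

    # Stage 5: return everything in one dict.
    return {
        'id': video_id,
        'title': title.strip(),
        'url': video_url,
        'description': description,
    }
```

The design choice worth noticing is the split between mandatory and optional fields: whether a helper is called with or without a default decides whether a site layout change causes a loud failure or a quietly missing field.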
model extractors
I'm pretty skeptical! But it's easy to remove them later, so let's not let "the best be the enemy of the good."
docstrings
I think they are not! Again, no matter how good the documentation of the function is, that is not the same as seeing an actual example of how it is used. And it's not typical to have examples of usage within docstrings (is it?).
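On the question of usage examples inside docstrings: Python's standard `doctest` module does support this, although whether it suits this codebase is a separate matter. The sketch below is only an illustration of the mechanism. `clock_to_seconds` and `check_docstring` are hypothetical names invented here, not functions from youtube-dl.

```python
import doctest
import re


def clock_to_seconds(text):
    """Convert an 'MM:SS' or 'HH:MM:SS' string to a number of seconds.

    The examples below are both documentation and tests:

    >>> clock_to_seconds('01:30')
    90
    >>> clock_to_seconds('1:00:05')
    3605
    >>> clock_to_seconds('soon') is None
    True
    """
    if not re.match(r'^\d+(?::\d{1,2}){1,2}$', text):
        return None
    seconds = 0
    for part in text.split(':'):
        # Each further field is worth 1/60 of the one before it.
        seconds = seconds * 60 + int(part)
    return seconds


def check_docstring(func):
    """Run the >>> examples in func's docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.failures, runner.tries
```

Such examples show how a single function is called, but they still do not show how several helpers fit together inside one extractor, which is the gap a commented model extractor would fill.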
Sure. But again, reading the purpose of a function and knowing how it may be invoked is not the same as seeing good examples of how to use it. All of these things are useful. But where we already have good examples among the 600+ extractors, it would be really helpful to have a way to find the good ones.
function naming
So, there is already a good docstring for
But strangely, it doesn't show up in the documentation that
Which leaves me to wonder, why do these functions have leading underscores at all? PEP 8 says:
And sure enough, pydoc excludes them.
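The pydoc behaviour is easy to reproduce in isolation. Below is a small sketch using a toy class; `DemoExtractor` and `pydoc_text` are made up for this illustration and have nothing to do with youtube-dl's real classes. It renders the class with pydoc's plain-text renderer so the output can be inspected as a string.

```python
import pydoc
import re


class DemoExtractor(object):
    """Toy class with one public method and one underscore-prefixed helper."""

    def extract(self, url):
        """Public entry point: return the first run of digits in url."""
        return self._search_regex(r'(\d+)', url)

    def _search_regex(self, pattern, string):
        """Carefully documented, yet left out of the generated pydoc page."""
        match = re.search(pattern, string)
        return match.group(1) if match else None


def pydoc_text(obj):
    """Return the plain-text documentation pydoc generates for obj."""
    return pydoc.render_doc(obj, renderer=pydoc.plaintext)
```

Searching the string returned by `pydoc_text(DemoExtractor)` finds `extract` but not `_search_regex`, even though the helper has a docstring of its own.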
Since I'm not intimately familiar with the youtube-dl codebase, when I make a change to an extractor I often want to review other extractors to see how things have been handled previously. But oftentimes, it's really not clear which of several choices is better, what is favored, and what is more elegant.
If I had time to read through every single extractor I could make those determinations, but with 600+ of them, that's kind of daunting. Oftentimes I'll grep for a function and start looking at how abc.py and abcnews.py do it, since they are first alphabetically. But that's perhaps not a great plan.
It would be nice if there were a few well-written sample or "model" extractors that were highlighted as Best Practices that we could look at and review. And if those were clearly identified somehow, so I know where to look (maybe such model extractors exist already, I'm just not sure what to be looking at?). All things being equal, it'd be great if they were at the head of the alphabet, too :)
Thanks.
Also, the youtube-dl coding style seems to discourage comments. Maybe when you're very familiar with the code and the functions in common.py they don't seem so important, but for someone new to the codebase, they can help a lot. So it would be nice if any such model extractors could also be well-commented, or at least more verbosely commented than is typical in the codebase.