The direct link detection (since 4e262a8) in the generic extractor will incorrectly treat an HTML page as a direct link to a video if the HTML starts with a BOM. Some examples:
This will render the generic extractor useless on some sites, see: #4534.
I've started working on a solution by simply stripping the BOM, but soon realized that would not be enough, as we would need to decode
`first_bytes` using the corresponding encoding. So I thought I'd bring this up, maybe you'll have better/simpler ideas?
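
To make the idea concrete, here is a rough sketch of what I have in mind, not a patch. It assumes the detection boils down to checking whether the first bytes of the response start with `<` after optional whitespace; `decode_first_bytes` and `looks_like_html` are hypothetical helper names, not existing functions in the codebase. The BOM constants come from the standard `codecs` module.

```python
import codecs
import re

# Longest BOMs first: the UTF-32 LE BOM (FF FE 00 00) begins with the
# UTF-16 LE BOM (FF FE), so the order of this list matters.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32-le'),
    (codecs.BOM_UTF32_BE, 'utf-32-be'),
    (codecs.BOM_UTF8, 'utf-8'),
    (codecs.BOM_UTF16_LE, 'utf-16-le'),
    (codecs.BOM_UTF16_BE, 'utf-16-be'),
]


def decode_first_bytes(first_bytes):
    """Hypothetical helper: strip a leading BOM and decode accordingly."""
    for bom, encoding in _BOMS:
        if first_bytes.startswith(bom):
            # 'replace' keeps a truncated multi-byte sequence at the end
            # of the buffer from raising UnicodeDecodeError.
            return first_bytes[len(bom):].decode(encoding, 'replace')
    # No BOM: fall back to UTF-8 (an assumption made for this sketch).
    return first_bytes.decode('utf-8', 'replace')


def looks_like_html(first_bytes):
    """Hypothetical check: does the decoded content start with '<'?"""
    return re.match(r'\s*<', decode_first_bytes(first_bytes)) is not None
```

With something like this, `looks_like_html(codecs.BOM_UTF8 + b'<!DOCTYPE html>')` is true, and so is the UTF-16 case, where merely stripping the BOM would leave interleaved NUL bytes that a bytes-level regex would not match.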