tv_grab_na_dtv stopped working #80

Closed
MaximVol opened this issue Dec 22, 2019 · 13 comments

Comments

@MaximVol

XMLTV Version?

XMLTV module version 0.5.69
This is tv_grab_na_dtv version 1.24, 2016/11/23 19:41:36

XMLTV Component?

tv_grab_na_dtv

What happened?

tv_grab_na_dtv stopped working on December 18, 2019. It hangs after connecting to directv.com and eventually terminates with a timeout. At the same time, I can load the DirecTV JSON in a browser without any problems. I tried changing the user agent in the tv_grab_na_dtv code to that of the latest version of Chrome, but this did not fix the issue.

What other software are you using?

Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u1 (2019-09-20) x86_64 GNU/Linux
This is perl 5, version 24, subversion 1 (v5.24.1) built for x86_64-linux-gnu-thread-multi

@ejonesnospam
Contributor

I have the same issue with tv_grab_na_dtv. xmltv produces no output data. This started at approximately the end of December, and no data has been collected since. It worked without issue for several years previously.

I'm an [old] developer, so if someone can point me in the right direction, I will happily take a look. Can't say I'll actually be of assistance, though ;)

Linux 4.19.66-v7+ #1253 SMP Thu Aug 15 11:49:46 BST 2019 armv7l GNU/Linux

@garybuhrmaster
Contributor

I have no direc(TV? ha-ha) knowledge, but I strongly suspect they (finally) implemented rate limiting on their web site for anonymous requests. One way to test this might be to change your channel list to select only something like 10 channels and see if it works for those 10. DirecTV would not be the first to address what they consider screen scraper abuse in that way. I have no idea whether that means that, to continue using such a screen scraper, one has to make requests a lot slower, break the requests into smaller batches per session, or log in to the site to keep making requests at a higher rate. Since DirecTV has not shared what they have done (and seems unlikely to do so), it will probably take a lot of trial and error to proceed.
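
To make the suggested experiment concrete, here is a minimal Python sketch of splitting a channel list into batches of 10 and pausing between batches. It is not taken from tv_grab_na_dtv (which is a Perl script); the names `batches` and `fetch_throttled`, the batch size, and the pause length are all assumptions chosen for illustration, and `fetch` stands in for whatever actually downloads one channel.

```python
import time
from typing import Callable, Dict, Iterator, List


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_throttled(
    channels: List[str],
    fetch: Callable[[str], str],
    batch_size: int = 10,          # hypothetical value, tune by trial and error
    pause_seconds: float = 5.0,    # hypothetical value, tune by trial and error
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Call fetch(channel) for every channel, pausing between batches."""
    results: Dict[str, str] = {}
    for index, batch in enumerate(batches(channels, batch_size)):
        if index > 0:
            sleep(pause_seconds)   # no pause before the first batch
        for channel in batch:
            results[channel] = fetch(channel)
    return results
```

If a run with one small batch succeeds while a full run times out, that would point toward rate limiting; if even a single channel times out, the cause is more likely elsewhere.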

For a subscriber of DirecTV, I would certainly suggest contacting them to see if they can provide a reliable (downloadable) EPG without the need for screen scraping. There are also other known EPG sources (none free that I know of) that include DirecTV schedule information and use formal (supported) APIs. One may need to consider moving in that direction for a more reliable fix if DirecTV is not willing/able to supply the data.

@ejonesnospam
Contributor

Saying that you don't know of any free sources, then suggesting that one move in that direction... can't say that's particularly helpful to me. If you know of actual rate-limiting actions that are in play by DirecTV, please provide that data. Otherwise, your input is just pure conjecture. I would like to move towards resolving the issue with "tv_grab_na_dtv", hence the primary reason for this issue page, and my follow-up comments.

@MaximVol
Author

MaximVol commented Jan 6, 2020

I don't think that it's related to rate limits or other restrictions.
I tested this link in different environments: https://www.directv.com/json/channels
It works well in all modern browsers: Chrome, Firefox and Edge.
It does not work with tv_grab_na_dtv, the wget console command, or the Lynx text browser.

I have two ideas:

  1. Maybe DirecTV doesn't recognize tv_grab_na_dtv as a valid browser. I tried changing the user agent in the tv_grab_na_dtv code, but this did not help. Maybe something else should be corrected?

  2. tv_grab_na_dtv uses some SSL ciphers that are no longer supported by the DirecTV website. But I have no idea how this can be tested.
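
One way the cipher idea could be probed is to list the cipher suites a client offers and then see what a TLS handshake with the server actually negotiates. The following is a minimal Python sketch under stated assumptions: it inspects Python's own `ssl` module, so it only approximates what the Perl grabber offers on the same machine, and the function names `local_cipher_names` and `negotiated_tls` are made up for this example.

```python
import socket
import ssl
from typing import List, Tuple


def local_cipher_names() -> List[str]:
    """Names of the cipher suites this Python/OpenSSL build offers by default."""
    context = ssl.create_default_context()
    return [cipher["name"] for cipher in context.get_ciphers()]


def negotiated_tls(host: str, port: int = 443, timeout: float = 10.0) -> Tuple[str, str]:
    """Connect to host and return (protocol version, cipher name) agreed on.

    Needs network access. A handshake failure raises ssl.SSLError, which
    would support the cipher theory; a successful handshake argues against it.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            name, _protocol, _bits = tls.cipher()
            return tls.version(), name


if __name__ == "__main__":
    print(len(local_cipher_names()), "cipher suites offered locally")
    print(negotiated_tls("www.directv.com"))
```

If the handshake completes and the request still hangs afterwards, the problem lies above the TLS layer, for example in the request headers.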

@ejonesnospam
Contributor

I've tested a few ideas of my own, and so far my latest theory is related to the "--compressed" option of curl. When I add this option, I am able to download data from the directv website reliably. This option is enabled by default when using the latest Chrome (on Mac), but I'm unsure whether it is enabled by default by the 'LWP::UserAgent' module that xmltv includes. I am currently testing a modified version that definitely enables this option, and initial tests are looking good. I have completed two full downloads of my schedule.
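
For readers unfamiliar with what "--compressed" changes on the wire: curl advertises compressed transfer via an Accept-Encoding request header and decompresses the response itself. Below is a minimal Python sketch of that behaviour, not the patch under test. The helper names `build_request`, `decode_body` and `fetch`, the user agent string, and the choice to advertise gzip only are assumptions made for illustration.

```python
import gzip
import urllib.request


def build_request(url: str, user_agent: str = "example-agent/1.0") -> urllib.request.Request:
    """Build a request that advertises gzip support, as a browser would."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,      # placeholder value, an assumption
            "Accept-Encoding": "gzip",     # the part that --compressed adds
        },
    )


def decode_body(body: bytes, content_encoding: str) -> bytes:
    """Undo gzip if the server says it applied it; otherwise return as-is."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download url and return the decompressed body. Needs network access."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        encoding = response.headers.get("Content-Encoding", "")
        return decode_body(response.read(), encoding)
```

The key point is that the client must both send the header and decompress the reply; sending the header alone would hand compressed bytes to the JSON decoder.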

MaximVol, if you upload your config file 'tv_grab_na_dtv.conf', I can test it as well.

@MaximVol
Author

My config file is attached:
tv_grab_na_dtv.conf.zip

@ejonesnospam
Contributor

ejonesnospam commented Jan 10, 2020

MaximVol

  • Two XML runs are attached; they should be almost exactly the same. One log is included for your review and input.
  • The runs were performed from approximately 11am to 1pm PST on Jan 10, 2020.

maxvol.zip

General Notes:

  • For my own convenience, the logs now include a program duration timer. Should this appear only with the debug option?
  • The default timeout for web requests is currently 240 seconds, which is why it sometimes takes hours for this operation to complete. I'm currently testing a command line option that lets the user configure it. So far, I'm seeing vast improvements with only a 10 second timeout.
  • Improper JSON responses (incomplete or malformed) no longer cause the entire operation to fail and exit with no data saved. Instead, a log entry of "Decoding JSON response failed." will appear.
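
The last two notes can be illustrated with a short Python sketch; the real changes live in the Perl grabber and may look quite different. The option name `--timeout` and the function names are assumptions for this example. Only the 240 second default and the log message are taken from the notes above.

```python
import argparse
import json
import logging
from typing import Any, List, Optional

log = logging.getLogger("grabber_sketch")


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse a user-configurable request timeout (hypothetical option name)."""
    parser = argparse.ArgumentParser(description="timeout option sketch")
    parser.add_argument(
        "--timeout",
        type=float,
        default=240.0,  # the current default mentioned in the notes
        help="seconds to wait for each web request",
    )
    return parser.parse_args(argv)


def decode_json_or_none(text: str) -> Optional[Any]:
    """Decode a JSON response, logging instead of aborting on bad input.

    Returns None when decoding fails, so the caller can skip this response
    and keep the data already collected. A valid JSON `null` also yields
    None, which is acceptable for this sketch.
    """
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        log.warning("Decoding JSON response failed.")
        return None
```

With this shape, a truncated response costs one channel or time slot rather than the whole run.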

To Do:

  • Create pull request
  • Further thought for coders: if the above was truly an intended design element, I would think logic for retries or download caching would need to be implemented. We shouldn't keep scraping the same material repeatedly.
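
As a starting point for that discussion, here is a minimal Python sketch of retrying a failed download a few times and caching successful ones so that the same URL is not fetched twice in one run. It is an illustration only, not part of the pull request; the class name `CachingFetcher`, the attempt count, and the delay are assumptions.

```python
import time
from typing import Callable, Dict, Optional


class CachingFetcher:
    """Wrap a fetch function with in-memory caching and simple retries."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        attempts: int = 3,             # hypothetical default
        delay_seconds: float = 2.0,    # hypothetical default
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._attempts = attempts
        self._delay = delay_seconds
        self._sleep = sleep
        self._cache: Dict[str, str] = {}

    def get(self, url: str) -> Optional[str]:
        """Return the body for url, or None if every attempt failed."""
        if url in self._cache:
            return self._cache[url]    # never re-download a cached URL
        for attempt in range(1, self._attempts + 1):
            try:
                body = self._fetch(url)
            except OSError:            # covers timeouts and urllib's URLError
                body = None
            if body is not None:
                self._cache[url] = body
                return body
            if attempt < self._attempts:
                self._sleep(self._delay)
        return None
```

A cache that survives between runs (for example, files on disk keyed by URL and date) would be the natural next step, but that is beyond this sketch.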

@MaximVol
Author

@ejonesnospam, can you attach the code of tv_grab_na_dtv with your corrections, please?

@ejonesnospam
Contributor

The code has been published and a pull request has been created.

@MaximVol
Author

@ejonesnospam, with your corrections, everything works well, thanks a lot!

@dimitry-ishenko
Contributor

FWIW, by simply adding the User-Agent header, I am able to download it with wget, e.g.:

wget -U '42' https://www.directv.com/json/channels

@knowledgejunkie
Contributor

@MaximVol Your bug report was created using an old version of XMLTV (the current release version is 0.6.1) on Debian oldstable. Do you see similar problems using the code in the current git master branch (with no patches applied), using the simple config file provided at grab/na_dtv/test.conf?

Note that the test file only attempts to retrieve listings for a single channel and the grabber output validates normally.

@knowledgejunkie
Contributor

@MaximVol I've tested git master with your config file (4 channels), and have replicated the timeout. Investigating further.
