Add support for Akamai HD player verification to the HDS module #222

Merged
merged 2 commits into from Dec 2, 2013

5 participants

vadmium commented Nov 16, 2013

This requires passing in a hash identifying the player, combining it with a code from the manifest file, and then appending it to each fragment URL as a query string. So far I am using it to stream ABC Iview stuff (geo-restricted to Australia I understand) on the CLI, like this:

livestreamer hds://http://iviewum-vh.akamaihd.net/[. . .] pvhash='[. . .]'

The HDS URLs are printed by the livestreamer branch of python-iview at https://github.com/vadmium/python-iview.


chrippa Nov 16, 2013

Owner

Hmm, I think this could be improved a bit.

  • Since you describe the process of getting a pvhash from a SWF, we might as well just accept a swf argument instead and then create the hash ourselves.
  • The urlappendix doesn't seem to be needed; you can just put the query params in the rsession.params dict instead. This also takes care of any quoting needed.
  • Don't use the logger to report the error about a missing argument when a pvtoken is found in the manifest; raise an IOError exception instead.
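
A minimal sketch of the last point, in Python 3. The helper name `require_player_arg` and its arguments are hypothetical and not taken from the patch. The idea is only that the missing argument is raised as an error, so the caller decides how to report it, rather than being logged and ignored.

```python
def require_player_arg(manifest_has_pv, swf_url):
    """Hypothetical helper: fail loudly when the manifest asks for player
    verification but no player SWF was supplied by the user."""
    if manifest_has_pv and not swf_url:
        raise IOError("This manifest requires player verification; "
                      "a player SWF URL must be supplied")
    return swf_url
```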

vadmium Nov 18, 2013

All good ideas. I’ll update the code when I get a chance.

  • Since you describe the process of getting a pvhash from a SWF, we might
    as well just accept a swf argument instead and then create the hash ourselves.

That would probably be a good idea, though I think it should also try
to cache the hashes to avoid needlessly downloading the SWF over and
over, similar to what “rtmpdump --swfVfy” does (though the hash method
is different). I thought this would be a significant amount of work,
but now I found you already have some generic caching support, as used
in revision 85eb4fa. I’ll use that as a base, and hopefully add an
“If-Modified-Since” HTTP condition if it’s not too hard.

  • The urlappendix doesn't seem needed, you can just put the query params in
    the rsession.params dict instead. This also takes care of any quoting
    needed.

Will do, although I think I’ll need to parse apart the name=value
halves of one of the params, which comes directly from the <pv-2.0>
element.

  • Don't use the logger to report the error about a missing argument when a
    pvtoken is found in the manifest; raise an IOError exception instead.

Sounds reasonable.
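
A rough sketch of the “If-Modified-Since” idea, written for Python 3's urllib rather than the HTTP session Livestreamer actually uses. The function names and the shape of the `cached` dict (a `"modified"` key holding the server's earlier Last-Modified value) are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def conditional_request(url, cached):
    """Build a GET request; add If-Modified-Since when a cached entry exists."""
    request = urllib.request.Request(url)
    if cached and cached.get("modified"):
        request.add_header("If-Modified-Since", cached["modified"])
    return request


def fetch_if_changed(url, cached, opener=urllib.request.urlopen):
    """Return (body, last_modified), or None if the server says 304 Not Modified."""
    try:
        with opener(conditional_request(url, cached)) as response:
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None  # the cached hash is still valid, nothing was downloaded
        raise
```

When `fetch_if_changed()` returns None the SWF was not transferred again and the stored hash can be reused; otherwise the hash would be recomputed from the returned body.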

chrippa Nov 18, 2013

Owner

That would probably be a good idea, though I think it should also try to cache the hashes to avoid needlessly downloading the SWF over and over, similar to what “rtmpdump --swfVfy” does (though the hash method is different). I thought this would be a significant amount of work, but now I found you already have some generic caching support, as used in revision 85eb4fa. I’ll use that as a base, and hopefully add an “If-Modified-Since” HTTP condition if it’s not too hard.

Yeah, that makes sense. I suggest you add a Cache object to the Stream base class, similar to how the Plugin class does it.

Will do, although I think I’ll need to parse apart the name=value halves of one of the params, which comes directly from the <pv-2.0> element.

You can use the parse_qsd function (dict version of parse_qsl from urllib.parse) from utils for this.
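
For illustration, a stand-in for such a helper in Python 3. This is a sketch of the idea ("dict version of parse_qsl"), not a copy of the function in utils, and the sample query strings are made up.

```python
from urllib.parse import parse_qsl


def parse_qsd(query):
    """Dict version of parse_qsl: split a query string into name/value pairs.

    Each pair is split on its first '=' only, and values are percent-decoded.
    If a name is repeated, the last value wins.
    """
    return dict(parse_qsl(query))


# Merging the parsed pair into an existing params dict (hypothetical values).
params = {"existing": "1"}
params.update(parse_qsd("token=exp=123~acl=%2f*"))
```

Because only the first `=` separates name from value, a value that itself contains `=` stays intact.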


Martin Panter added some commits Sep 21, 2013

Martin Panter
stream.hds: Support Akamai HD player verification, via “pvhash” parameter.

Reads <pv-2.0> element from manifest and sets up query string parameters to
be added to each fragment URL.
Martin Panter
stream.hds: Calculate and cache player hashes via “pvswf” parameter.
Hashes are saved in a new “stream.json” cache file. The SWF file is requested
with an “If-Modified-Since” condition if a cached hash is already available.

vadmium Dec 2, 2013

Finally got round to updating this. I replaced the pvhash parameter with one called pvswf. If player verification is needed, the hash is cached in ~/.cache/livestreamer/stream.json, which looks a little like this:

{"akamaihd-player:http://www.abc.net.au/iview/iview_388.swf": {"value": {"hash": "7ob1gDzeD6B33Q6WHsCoIlv6HQhCmcM4WGc36Y6bD+Q=", "modified": "Tue, 30 Apr 2013 01:16:58 GMT"}, "expires": 1386558567.4624891}}

I ended up just manually creating the Cache object when it was needed, because the Stream classes don’t seem to have anything like the Plugin.bind() class method that gets called for each plugin.
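
To show how an entry in that layout could be read back, here is a small Python 3 sketch. It relies only on the JSON structure shown above (a `"value"` plus an `"expires"` timestamp per key); the function name `cache_get` is hypothetical and this is not the Cache class itself.

```python
import json
import time


def cache_get(raw_json, key, now=None):
    """Return the cached value stored under key, or None if missing or expired."""
    now = time.time() if now is None else now
    entry = json.loads(raw_json).get(key)
    if entry is None or entry["expires"] <= now:
        return None
    return entry["value"]
```

A hit returns the dict with the `"hash"` and `"modified"` fields; the latter is what would be sent back to the server as the “If-Modified-Since” value.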

chrippa added a commit that referenced this pull request Dec 2, 2013

Merge pull request #222 from vadmium/pv
Add support for Akamai HD player verification to the HDS module

@chrippa chrippa merged commit 1df161d into chrippa:develop Dec 2, 2013

1 check passed

default The Travis CI build passed

chrippa Dec 2, 2013

Owner

Looks good, thanks!


athoik Dec 2, 2013

Contributor

Could you please add a function to utils for checking HTTP dates?

Not every machine has the email.utils package, especially embedded machines.

            # Only save in cache if a valid date is given
            if email.utils.parsedate(modified):

When email.utils is not available, it can check the date using datetime as a fallback:

datetime.strptime(modified, '%a, %d %b %Y %H:%M:%S GMT')

What do you think?


chrippa Dec 2, 2013

Owner

Hmm, I guess checking the date format is not really necessary. The HTTP spec does not even seem to require a specific date format; it simply advises using the Last-Modified header from the server, whatever that may be. But since we write this to disk, we should probably at least check that it's not abnormally long.


vadmium Dec 2, 2013

Yes, verifying the format is not strictly necessary. I was a bit uncertain whether I should include it, but I thought it would be good to avoid the chance of storing arbitrary cookie data. If it is a real problem it might be simplest to drop the check, or just check the length or something.

According to RFC 2616, the Last-Modified header (section 14.29) is supposed to be in HTTP-date format, specified in section 3.3.1 as one preferred format and two alternative formats. Using email.utils.parsedate() seemed to cover all three formats.

@athoik, it shouldn’t be hard to extend your strptime() suggestion to try all three formats, if that is needed. I would tend to try time.strptime() rather than datetime.strptime(), because time is already used in Livestreamer and datetime is not. Do you have a list of which modules may not be available, or are you just trying to keep the number of different modules down?
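
As a quick check of the parsedate() claim, here is a sketch in Python 3 using the three example dates from RFC 2616 section 3.3.1. The helper name `is_http_date` is made up for this illustration.

```python
import email.utils

SAMPLES = [
    "Sun, 06 Nov 1994 08:49:37 GMT",   # RFC 1123 (preferred)
    "Sunday, 06-Nov-94 08:49:37 GMT",  # RFC 850
    "Sun Nov  6 08:49:37 1994",        # ANSI C asctime()
]


def is_http_date(value):
    """True if email.utils.parsedate() can make sense of the value."""
    return email.utils.parsedate(value) is not None


# The first six fields are year, month, day, hour, minute, second.
parsed = [email.utils.parsedate(sample)[:6] for sample in SAMPLES]
```

parsedate() returns None for input it cannot parse, so the check never raises on a malformed header.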


raylu Dec 3, 2013

What embedded system doesn't ship the email.utils module? It's part of the standard library.


athoik Dec 3, 2013

Contributor

It is not installed by default when using openembedded with OpenPLi.

Of course it is available on OpenPLi, but this is not true for every embedded system.

python-email - 2.7.3-r5.3 - python version 2.7.3-r5.3  Python Email Support

When you have only a few MB of memory and free space, every KB counts!

# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                60.0M     54.8M      5.2M  91% /

# free
             total         used         free       shared      buffers
Mem:        136004       122440        13564            0         1876
-/+ buffers:             120564        15440
Swap:            0            0            0

Here are the dependencies of livestreamer (and requests).

# opkg info livestreamer
Package: livestreamer
Version: 1.7.0-r0
Depends: python-pkgutil, python-shell, python, python-ctypes, python-requests, python-subprocess, python-core, python-misc

# opkg info python-requests
Package: python-requests
Version: 1.2.3-r3
Depends: python, python-json, python-codecs, python-core, python-io, python-compression, python-zlib

What I propose is to use the standard time.strptime() when email.utils is not installed.


vadmium Dec 3, 2013

Thanks for the list of dependencies, that helps me understand the situation. I was expecting that python-requests already depended on the email package, but in Python 2.7 the httplib module imports mimetools instead. (I know that the Python 3 version does use email.)

To avoid fragmentation and to keep things simple, I’d probably go with the single custom implementation unconditionally, rather than checking for the email.utils module. But I just realised that strptime() is probably no good because it depends on the locale.


athoik Dec 3, 2013

Contributor

Can you check whether the following works?

import time
import locale

# set locale to "C"
locale.setlocale(locale.LC_ALL, 'C')

modified = "Sun, 06 Nov 1994 08:49:37 GMT"
time.strptime(modified, '%a, %d %b %Y %H:%M:%S GMT')
modified = "Sunday, 06-Nov-94 08:49:37 GMT"
time.strptime(modified, '%A, %d-%b-%y %H:%M:%S GMT')
modified = "Sun Nov  6 08:49:37 1994"
time.strptime(modified, '%a %b %d %H:%M:%S %Y')

And here is the function for utils.

import time
import locale

RFC1123_DATE = '%a, %d %b %Y %H:%M:%S GMT'
RFC850_DATE = '%A, %d-%b-%y %H:%M:%S GMT'
ASCTIME_DATE = '%a %b %d %H:%M:%S %Y'

def parse_http_date(date):
    # set locale to "C"
    locale.setlocale(locale.LC_ALL, 'C')
    parsed = None
    for format in RFC1123_DATE, RFC850_DATE, ASCTIME_DATE:
        try:
            parsed = time.strptime(date, format)
            break
        except ValueError:
            pass
    return parsed

Another implementation is here: https://github.com/django/django/blob/master/django/utils/http.py

import re

MONTHS = 'jan feb mar apr may jun jul aug sep oct nov dec'.split()
__D = r'(?P<day>\d{2})'
__D2 = r'(?P<day>[ \d]\d)'
__M = r'(?P<mon>\w{3})'
__Y = r'(?P<year>\d{4})'
__Y2 = r'(?P<year>\d{2})'
__T = r'(?P<hour>\d{2}):(?P<min>\d{2}):(?P<sec>\d{2})'
RFC1123_DATE = re.compile(r'^\w{3}, %s %s %s %s GMT$' % (__D, __M, __Y, __T))
RFC850_DATE = re.compile(r'^\w{6,9}, %s-%s-%s %s GMT$' % (__D, __M, __Y2, __T))
ASCTIME_DATE = re.compile(r'^\w{3} %s %s %s %s$' % (__M, __D2, __T, __Y))

def parse_http_date(date):
    """
    Parses a date format as specified by HTTP RFC2616 section 3.3.1.

    The three formats allowed by the RFC are accepted, even if only the first
    one is still in widespread use.

    Returns an integer expressed in seconds since the epoch, in UTC.
    """
    # emails.Util.parsedate does the job for RFC1123 dates; unfortunately
    # RFC2616 makes it mandatory to support RFC850 dates too. So we roll
    # our own RFC-compliant parsing.
    for regex in RFC1123_DATE, RFC850_DATE, ASCTIME_DATE:
        m = regex.match(date)
        if m is not None:
            break
    else:
        raise ValueError("%r is not in a valid HTTP date format" % date)

chrippa Dec 3, 2013

Owner

Since the Python documentation recommends against changing the locale in a library, and since we don't actually need to parse the date, I've decided to just use a length check instead to keep it simple.
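
Roughly along these lines, as a sketch in Python 3. The name `safe_to_cache` and the limit of 40 characters are assumptions chosen for illustration, not values quoted from the commit.

```python
MAX_MODIFIED_LENGTH = 40  # hypothetical upper bound for a Last-Modified value


def safe_to_cache(modified):
    """Accept a Last-Modified header only if it is present and not abnormally long."""
    return bool(modified) and len(modified) < MAX_MODIFIED_LENGTH
```

This needs neither email.utils nor the locale, at the cost of not checking that the value is actually a date.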

@vadmium vadmium deleted the vadmium:pv branch Dec 8, 2013


K-S-V Dec 12, 2013

@vadmium Did you work out the HMAC key yourself, or was this post of some use? IIRC my forum post was the first one to describe the method and publish the HMAC key. It was later copied by pluzzdl and other software without mentioning the source or giving any credit whatsoever. If you did the hard part yourself then kudos to you; otherwise, mentioning the source isn't going to hurt anyone.


vadmium Dec 12, 2013

Hi @K-S-V. No I did not work out the HMAC key myself. I used the value
I found in your post and in the Pluzz code. I understand it is hidden
inside a separate SWF module loaded by the player, so I appreciate
your effort figuring it out, as well as explaining the verification
process. Sorry if I gave the wrong impression. We could probably put
in a short comment or something in the code saying where the key came
from, if you would like.


athoik Dec 13, 2013

Contributor

Hi @K-S-V,

It's really good to see you here... Thank you for all the patches you are making available!

It would be really nice if you could take a look at @chrippa's https://github.com/chrippa/python-librtmp; maybe there is a better way of supporting new sites, without the need to patch rtmpdump every time.

PS. Your latest patch doesn't apply cleanly after the latest commits on rtmpdump.


K-S-V Dec 22, 2013

@vadmium chrippa added the source reference in 6bba2e8.

@athoik Yeah, a plugin system seems like a better way to add support for new sites. Someone even posted a patchset for a plugin system on the rtmpdump mailing list, but it never got merged into the main branch.

My patches already contain the bugfixes mentioned in the new commits. Anyway, I will provide a clean patch in due time.

javiercantero pushed a commit to javiercantero/livestreamer that referenced this pull request Jan 7, 2017
