Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Video upload (take 2) #929

Closed
wants to merge 9 commits into from
Closed

Conversation

jamesandres
Copy link

This is a more up to date version of #672 originally crafted by @Choko256 and @fitnr.

The original description from the previous PR:

As I mentioned in #655, this is an alternate way of adding video upload. The media_upload method gets a conditional, with smaller images routed to 'normal' upload, and larger images and videos chunked upload. I also moved upload limits to a class variable, to make them easier to change if the need arises.

This also fixes from unicode errors as flagged by @jeremylow.

@jamesandres jamesandres mentioned this pull request Sep 21, 2017
@jamiewinspear
Copy link
Contributor

I have found a bug with this - It currently only allows support for chunked media uploads related to public content. As it turns out the media_category parameter is not differentiated solely by content type ("image", "video" or "gif") but by whether the content is public or private also. It can also take values of dm_image, dm_gif or dm_video. I will be looking into fixing this tomorrow, but would hesitate to merge this before it is addressed.

@TheEssem
Copy link

It's been addressed, now what?

@jamiewinspear
Copy link
Contributor

At this point we are just waiting for a merge, we have been using this in production from the time I committed that last fix with no issues.

@TheEssem
Copy link

TheEssem commented Oct 17, 2017 via email

@fitnr
Copy link
Contributor

fitnr commented Oct 17, 2017

@TheEssemCraft There aren't any conflicts between this branch and origin/master. There have only been three pull requests merged since this branch was rebased onto master, all of them relatively minor.

@fitnr
Copy link
Contributor

fitnr commented Nov 1, 2017

@jamesandres @jamiewinspear
I've found and fixed a bug when uploading images from file handles. The name of the parameter was inconsistent (file or f). While fixing, I found it useful to move the functionality for checking filesize to a function in utils.py. The update is in my fork, I suggest that the commits should be added to this pull.

@guilouro

This comment has been minimized.

@fitnr

This comment has been minimized.

@guilouro

This comment has been minimized.

@braian87b

This comment has been minimized.

Copy link

@braian87b braian87b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works perfecty!!!

@corbindavenport

This comment has been minimized.

@braian87b

This comment has been minimized.

@Xpktro

This comment has been minimized.

@braian87b

This comment has been minimized.

@braian87b

This comment has been minimized.

@corbindavenport

This comment has been minimized.

@braian87b

This comment has been minimized.

@jh69

This comment has been minimized.

@braian87b

This comment has been minimized.

@gloriouskilka

This comment has been minimized.

@Spyder-exe

This comment has been minimized.

@fitnr

This comment has been minimized.

@braian87b

This comment has been minimized.

@corbindavenport

This comment has been minimized.

@braian87b

This comment has been minimized.

@Spyder-exe

This comment has been minimized.

@iandow

This comment has been minimized.

@alsuren
Copy link
Contributor

alsuren commented Apr 8, 2019

I just rebased on top of master and force-pushed. The only conflict was in a docstring.

We have been using this in prod since 2017 and it seems fine. I will try to deploy this rebased version tomorrow.

We have a little wrapper around it that shouldn't be contributed upstream, but is worth mentioning here:


def _media_upload(connection, name, file_obj=None, is_direct_message=False):
    """Wrap API connection.media_upload to check for readiness

    Media that are to be attached to Tweets must first be uploaded to Twitter.
    If one attempts to use the media before Twitter has finished processing
    it then the result is an internal server error. The proper solution would
    be to check whether or not a media upload has completed before using the
    media.

    When a media_upload call is made to the Twitter API we check for a
    'pending' or 'in_progress' state. If the request hasn't completed then we
    wait for the suggested time (called 'check_after_secs') before returning
    the media reference.

    What could still go wrong? The request could still be pending or
    in_progress after the delay, and this doesn't recheck the status. To be
    able to check the status we need to be able to call GET media/upload
    (STATUS). The API library doesn't currently support this, and adding it
    here seems a hack too far.
    """

    # HACK: This should really be handled in the API library layer
    media = connection.media_upload(
        name, file=file_obj, is_direct_message=is_direct_message)
    blocking_statuses = ['pending', 'in_progress']
    if (hasattr(media, 'processing_info') and
            media.processing_info['state'] in blocking_statuses):
        check_after = media.processing_info['check_after_secs']
        time.sleep(check_after)

    return media

@Harmon758 Harmon758 added the Feature This is regarding a new feature label Apr 28, 2019
@Harmon758 Harmon758 force-pushed the master branch 5 times, most recently from cf54e31 to d538993 Compare August 27, 2019 04:24
This was referenced Sep 2, 2019
@fitnr fitnr mentioned this pull request Sep 11, 2019
@umar13893
Copy link

umar13893 commented Nov 30, 2019

Hi, any solution for this? My status updating on twitter profile without video, for the image it works fine.

EDIT: It seems it's working fine but for some reasons, I'm getting "Not valid video" error.

@awcarlsson
Copy link

Any updates on the "Not valid video" error? Every video I try to upload seems to run into this.

@umar13893

This comment has been minimized.

@Shumaister
Copy link

Hello there,

i read a lot of issues here and, I try this

upload_result = api.media_upload('TestVideo.mp4', file='/myPath/TestVideo.mp4')
api.update_status(status="Texto del tuit", media_ids=[upload_result.media_id_string])

And I've got this message:

raise TweepError('Invalid file type for image: %s' % file_type)
tweepy.error.TweepError: Invalid file type for image: video/mp4

i don't know if I have an error or this is a problem with tweepy

@mattdonders
Copy link

In Discord, it was asked to put together a MCVE for this so that's here now. This grabs an MP4 from the NHL back-end API and tries to post it to Twitter.

It requires you to pass in consumer & access keys, but should be readily available -

python3 nhl-highlight-ripper-poster.py --twitter \
    --consumerkey "XXXXXXXX" --consumersecret "XXXXXXXX" \
    --accesstoken "XXXXXXXX" --accesssecret "XXXXXXXX"
import argparse
import requests
import shutil
import tweepy


parser = argparse.ArgumentParser()
parser.add_argument("--discord", help="log to console instead of file", action="store_true")
parser.add_argument("--disordwebhook", help="override team in configuration", action="store")
parser.add_argument("--twitter", help="log to console instead of file", action="store_true")
parser.add_argument("--consumerkey", help="override team in configuration", action="store")
parser.add_argument("--consumersecret", help="override team in configuration", action="store")
parser.add_argument("--accesstoken", help="override team in configuration", action="store")
parser.add_argument("--accesssecret", help="override team in configuration", action="store")
args = parser.parse_args()


def download_file(url):
    local_filename = url.split("/")[-1]
    with requests.get(url, stream=True) as r:
        with open(local_filename, "wb") as f:
            shutil.copyfileobj(r.raw, f)

    return local_filename


sample_video = download_file(
    "https://hlslive-wsczoominwestus.med.nhl.com/publish/80b062f2-4778-4224-8dbb-df201e7eb233.mp4"
)

payload = {"content": "NHL highlight video."}
video_file = open(sample_video, "rb")
files = {"file": video_file}

if args.discord:
    response = requests.post(args.discordwebhook, files=files, data=payload)

if args.twitter:
    consumer_key = args.consumerkey
    consumer_secret = args.consumersecret
    access_token = args.accesstoken
    access_secret = args.accesssecret

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_secret)
    api = tweepy.API(auth)

    media_ids = [api.media_upload(sample_video).media_id_string]
    status = api.update_status(status="NHL highlight video.", media_ids = media_ids)

if not response.ok:
    print(response.json())

This is the error returned from code -

Traceback (most recent call last):
  File "nhl-highlight-ripper-poster.py", line 50, in <module>
    media_ids = [api.media_upload(sample_video).media_id_string]
  File "/Users/mattdonders/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tweepy/api.py", line 200, in media_upload
    headers, post_data = API._pack_image(filename, 4883, form_field='media', f=f)
  File "/Users/mattdonders/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tweepy/api.py", line 1322, in _pack_image
    raise TweepError('Invalid file type for image: %s' % file_type)
tweepy.error.TweepError: Invalid file type for image: video/mp4

Here is the ffmpeg output of the file -

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '80b062f2-4778-4224-8dbb-df201e7eb233.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.3.100
  Duration: 00:00:28.65, start: 0.000000, bitrate: 1335 kb/s
    Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 640x360 [SAR 1:1 DAR 16:9], 1261 kb/s, 59.96 fps, 59.96 tbr, 90k tbn, 119.88 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 64 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Copy link
Member

@Harmon758 Harmon758 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interaction between upload_chunked and _chunk_media is messy.
_chunk_media should be split into three separate methods, corresponding to INIT, APPEND, and FINALIZE.
They don't need to be public methods, but each should be a bound API method so that the logic in upload_chunked can be refactored.

There are also conflicts now that need to be resolved.

Comment on lines +28 to +30
max_size_standard = 5120 # standard uploads must be less then 5 MB
max_size_chunked = 15360 # chunked uploads must be less than 15 MB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason these aren't initialized in __init__ or better yet, constants outside the class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be 5 and 15 MiB rather than MB. Have these limits been tested?
The library is currently using limits of 4883 and 14649 KiB right now, corresponding to just over 5 and 15 MB.

from tweepy.utils import list_to_csv

IMAGE_MIMETYPES = ('image/gif', 'image/jpeg', 'image/png', 'image/webp')
CHUNKED_MIMETYPES = ('image/gif', 'image/jpeg', 'image/png', 'image/webp', 'video/mp4')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CHUNKED_MIMETYPES = ('image/gif', 'image/jpeg', 'image/png', 'image/webp', 'video/mp4')
CHUNKED_MIMETYPES = IMAGE_MIMETYPES + ('video/mp4',)

"""
f = kwargs.pop('file', None)

mime, _ = mimetypes.guess_type(filename)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been changed to use imghdr.what with #1086.
mimetypes.guess_type should probably only be used as a fallback in the case that imghdr.what fails to determine the type to check if the file is an mp4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imghdr doesn't work on video files, so that's another reason to keep it as a backup.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was suggesting that mimetypes.guess_type be used as a fallback to check if the file is an mp4 in that case.
This has already been implemented in the master branch with 7c60edd to resolve #1411.

size = f.tell()
f.seek(0)

if mime in IMAGE_MIMETYPES and size < self.max_size_standard:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.getsize returns the size in bytes.
Given a filename, this will chunk any image greater than 5120 bytes rather than 5 MB.

raise TweepError("Can't upload media with mime type %s" % mime)

def image_upload(self, filename, *args, **kwargs):
""" :reference: https://dev.twitter.com/rest/reference/post/media/upload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
""" :reference: https://dev.twitter.com/rest/reference/post/media/upload
""" :reference: https://developer.twitter.com/en/docs/media/upload-media/api-reference/post-media-upload

try:
if file_size > (max_size * 1024):
raise TweepError('File is too big, must be less than %skb.' % max_size)
except os.error as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except os.error as e:
except OSError as e:

Comment on lines +1494 to +1495
except os.error as e:
raise TweepError('Unable to access file: %s' % e.strerror)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.getsize is outside the try block and nothing else in it will raise OSError.

raise TweepError('File is too big, must be less than %skb.' % max_size)
f.seek(0) # Reset to beginning of file
fp = f
elif command != 'finalize':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other option is

Suggested change
elif command != 'finalize':
elif command == 'append':

Comment on lines +1512 to +1519
# video must be mp4
file_type, _ = mimetypes.guess_type(filename)

if file_type is None:
raise TweepError('Could not determine file type')

if file_type not in CHUNKED_MIMETYPES:
raise TweepError('Invalid file type for video: %s' % file_type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already determined in media_upload and could be an image.
The file type should be passed to this method rather than redetermined here.

Comment on lines +1523 to +1570
if command == 'init':
query = {
'command': 'INIT',
'media_type': file_type,
'total_bytes': file_size,
'media_category': API._get_media_category(
is_direct_message, file_type)
}
body.append(urlencode(query).encode('utf-8'))
headers = {
'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8'
}
elif command == 'append':
if media_id is None:
raise TweepError('Media ID is required for APPEND command.')
body.append(b'--' + BOUNDARY)
body.append('Content-Disposition: form-data; name="command"'.encode('utf-8'))
body.append(b'')
body.append(b'APPEND')
body.append(b'--' + BOUNDARY)
body.append('Content-Disposition: form-data; name="media_id"'.encode('utf-8'))
body.append(b'')
body.append(str(media_id).encode('utf-8'))
body.append(b'--' + BOUNDARY)
body.append('Content-Disposition: form-data; name="segment_index"'.encode('utf-8'))
body.append(b'')
body.append(str(segment_index).encode('utf-8'))
body.append(b'--' + BOUNDARY)
body.append('Content-Disposition: form-data; name="{0}"; filename="{1}"'.format(form_field, os.path.basename(filename)).encode('utf-8'))
body.append('Content-Type: {0}'.format(file_type).encode('utf-8'))
body.append(b'')
body.append(fp.read(chunk_size))
body.append(b'--' + BOUNDARY + b'--')
headers = {
'Content-Type': 'multipart/form-data; boundary=Tw3ePy'
}
elif command == 'finalize':
if media_id is None:
raise TweepError('Media ID is required for FINALIZE command.')
body.append(
urlencode({
'command': 'FINALIZE',
'media_id': media_id
}).encode('utf-8')
)
headers = {
'Content-Type': 'application/x-www-form-urlencoded; charset=utf-8'
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be combined with the prior if-elif statement.

@Harmon758 Harmon758 added Improvement This is regarding an improvement to an existing feature Need Follow-Up This needs to be followed up on to be actionable labels Jul 25, 2020
@Harmon758
Copy link
Member

@jamesandres @jamiewinspear @alsuren I've gone ahead and done a first pass at a review of the PR.

I've found and fixed a bug when uploading images from file handles. The name of the parameter was inconsistent (file or f). While fixing, I found it useful to move the functionality for checking filesize to a function in utils.py. The update is in my fork, I suggest that the commits should be added to this pull.

There also seem to be further improvements by @fitnr that haven't been included in this PR yet.

@braian87b @jh69 @corbindavenport @Spyder-exe @iandow @mjarrett You can simply install from this PR or the branch of the fork this PR is from rather than creating or using another fork with the same changes.

@umar13893 @awcarlsson Do your videos meet the criteria in the Video specifications and recommendations section of https://developer.twitter.com/en/docs/media/upload-media/uploading-media/media-best-practices?

@Shumaister @mattdonders Did you install Tweepy from this PR?
@mattdonders Line 1322 of api.py where your error occurs isn't even part of _pack_image in this PR, so you're likely using a version that's not this PR.

@jackyliang
Copy link

jackyliang commented Aug 8, 2020

Is this safe to pull to use prior to the v4 launch?

Also, sort of of new to patching libraries.. I installed Tweepy using pip. How could I install Tweepy from this PR?

@Harmon758
Copy link
Member

There shouldn't be any risk to using this PR, but there are still issues that need to be resolved.
Feel free to test it and report any issues you encounter.
Note though, that this PR seems to be stale now, and I'm considering working off of it to create a new branch/PR.

See https://tweepy.readthedocs.io/en/latest/install.html and https://pip.pypa.io/en/stable/reference/pip_install/#git.

@fitnr fitnr mentioned this pull request Aug 11, 2020
@Harmon758
Copy link
Member

Superseded by #1414

@Harmon758 Harmon758 closed this Dec 25, 2020
@Harmon758 Harmon758 added Stale This is inactive, outdated, too old, or no longer applicable and removed Need Follow-Up This needs to be followed up on to be actionable labels Dec 25, 2020
@Harmon758
Copy link
Member

The video-upload branch / pull request #1486 should be complete now.
Any feedback or review would be appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature This is regarding a new feature Improvement This is regarding an improvement to an existing feature Stale This is inactive, outdated, too old, or no longer applicable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet