Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds video upload #1414

Merged
merged 14 commits into from Dec 28, 2020
Merged

Adds video upload #1414

merged 14 commits into from Dec 28, 2020

Conversation

fitnr
Copy link
Contributor

@fitnr fitnr commented Aug 11, 2020

This squashes the changes in #929 down to a single commit, incorporating comments from @Harmon758.

Some of this work is due @jamesandres and @Choko256.

This is currently in draft because I have poor internet for the next week and testing video upload isn't feasible. We've waited five years, so another week should be fine. 😄

Michael Chacaton and others added 9 commits May 5, 2017 09:50
* And safe import urllib in both py2/3

* rename methods to match API
* DRY max sizes, put in class in case twitter changes them
* use single media_upload for image and video, send to standard or
* chunked upload based on mime type and size

* finalize needs ModelParser, not RawParser
Add minimum chunk size of 16K, keeps number of chunks under 999
@Harmon758
Copy link
Member

I think it would be preferable to keep the original commits if possible, to keep credit where it's due.

@fitnr
Copy link
Contributor Author

fitnr commented Aug 14, 2020

How about I squash them all into a single commit with multiple authors?

@Harmon758
Copy link
Member

Is there a reason it needs to be a single commit?

@fitnr
Copy link
Contributor Author

fitnr commented Aug 14, 2020

Yes. The main branch has changed so much since this was introduced that rebasing each commit is tiresome.

@Harmon758
Copy link
Member

Wouldn't it be possible to use the original commits already in your video-upload2 branch and make a merge commit that merges the main branch and resolves any conflicts? I think if you really wanted to, you could even just copy all the code as it is in this commit and paste it as part of the merge commit. Although, it'd be preferable for the additional changes in this commit to be separate from changes for conflict resolution as well. Another reason for this is that it'd be easier to specifically review the changes you made in addition to what's already in #929 and the additional commits you have in video-upload2 than to re-review the entire thing.

@fitnr fitnr marked this pull request as ready for review August 22, 2020 22:37
@fitnr
Copy link
Contributor Author

fitnr commented Aug 22, 2020

I've force-pushed a new version that has the complete history.

@Harmon758 Harmon758 self-assigned this Aug 22, 2020
@Harmon758 Harmon758 self-requested a review August 22, 2020 22:47
@Harmon758 Harmon758 added Feature This is regarding a new feature Improvement This is regarding an improvement to an existing feature labels Aug 22, 2020
Copy link
Member

@Harmon758 Harmon758 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the merge commit itself had all the changes, I went ahead and re-reviewed the entire PR.

Although some of the issues I pointed out in my initial review of #929 were addressed, there seem to have been some regressions as well. Did you end up testing this PR yet?

My initial review that the interaction between upload_chunked and _chunk_media is messy also still stands.
_chunk_media should be split into three separate methods, corresponding to INIT, APPEND, and FINALIZE.
They don't need to be public methods, but each should be a bound API method so that the logic in upload_chunked can be refactored.

Also, I'm not sure how much of fitnr@4a3e2cf made it into this PR. Are there missing fixes/improvements from that commit?

@@ -5,14 +5,24 @@
import imghdr
import mimetypes
import os

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessary. Separating standard library imports from third party imports is consistent with PEP 8.

tweepy/api.py Outdated
max_size = 14649
size = os.path.getsize(filename)

if file_type == 'gif' or file_type in CHUNKED_TYPES:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first part of this or is redundant; 'gif' is in CHUNKED_TYPES.

I'm not sure what this if-else statement is for. The only cases where the conditional is false is when the file type is not supported or when imghdr.what is unable to determine the file type from the header of a valid image file type (e.g. #1411) and the fallback mimetypes.guess_type determines the file type instead.

tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
@Harmon758 Harmon758 added the Need Follow-Up This needs to be followed up on to be actionable label Aug 22, 2020
@Harmon758 Harmon758 linked an issue Aug 23, 2020 that may be closed by this pull request
tweepy/api.py Outdated Show resolved Hide resolved
Copy link
Member

@Harmon758 Harmon758 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @Maradonna90 pointed out, these references weren't updated from #929.

tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
tweepy/api.py Outdated Show resolved Hide resolved
@fitnr
Copy link
Contributor Author

fitnr commented Sep 6, 2020

Thank you @Harmon758 and @Maradonna90 for your help reviewing. I've incorporated your suggested changes.
In particular:

  • new global constants are used for tracking the MIN, MAX and DEFAULT chunk sizes. All of these constants are stored in KiB, and multiplied by 1024 when comparing to bytes.
  • Enhancement: the api.media_upload method will use chunked upload for images that exceed the standard upload size limit
  • Repeated logic around checking file types has been removed/simplified
  • I added two public domain sample files (gif and mp4) and three methods (and casettes) around the media upload endpoint.

@savetz
Copy link

savetz commented Sep 8, 2020

I am eager for the video upload feature, and respectfully ask that it be added to tweepy as soon as possible. Thank you all for your work on this project.

@Maradonna90
Copy link

Maradonna90 commented Sep 8, 2020

If I upload a bigger video and try to post a tweet with it I can get a 324 errorcode with not valid video. This happends because the upload hasn't finished yet.

I checked the twitter API and found a method that checks for the upload progress of media. I wrote a small function in my project to use it.

def get_media_upload_status(api, *args, **kwargs):
    """ :reference: https://developer.twitter.com/en/docs/twitter-api/v1/media/upload-media/api-reference/get-media-upload-status
            :allowed_param:
        """

    return bind_api(
            api=api,
            path='/media/upload.json',
            payload_type='media',
            allowed_param=['command', 'media_id'],
            upload_api=True,
            require_auth=True
    )(*args, **kwargs)

potentially the media_upload method should return the media_id when the media_upload finished. The above funciton could be used within a routine.

P.S: A quickfix obv is to use time.sleep(), but I think this is not intuitive for user to recognize and might cause significant slowdowns in the process (e.g waiting for 30s although the upload is finished after 5)

@vegit0

This comment has been minimized.

@savetz

This comment has been minimized.

@savetz

This comment has been minimized.

@Maradonna90

This comment has been minimized.

@savetz

This comment has been minimized.

@savetz

This comment has been minimized.

@fitnr fitnr requested a review from Harmon758 September 9, 2020 17:31
@fitnr
Copy link
Contributor Author

fitnr commented Sep 9, 2020

I've added a commit with @Maradonna90's get_media_upload_status, it's much more efficient than time.sleep(10)

@fitnr

This comment has been minimized.

Copy link
Member

@Harmon758 Harmon758 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to push this back to v4.0. This would be the only major feature to be added to v3.10, and with it being the last version to support Python 2.7 (and probably Python 3.5), I think it'd be better to have it be released with v4.0, where there's a lot of other new features planned, rather than including it in a version that I'd like to be as mature and stable as possible on release. I'd like to begin development of v4.0 as soon as possible, and if any bugs with v3.10 are found after there's commits to the master branch intended for v4.0, I'd probably have to make a separate v3.10.x branch to backport any fixes to release.

I also think there's still a lot of improvements to be made. For example, I'm still of my initial opinion that _chunk_media should be refactored into three separate methods.

Regardless, after v3.10 is released, I'll probably merge this into a new separate video-support branch and make a draft PR to merge that branch into the master branch. That will increase visibility and ease of access for anyone wanting to test the feature out before it's released. Also, that way, I'll be able to fix and improve any remaining issues that I find, without having to go through another (and potentially further) reviews, and of course, at that point, any PRs to that branch would be welcome, if anyone else, @fitnr included, wants to make any additional improvements.

@Harmon758 Harmon758 removed the Need Follow-Up This needs to be followed up on to be actionable label Dec 25, 2020
@Harmon758 Harmon758 added this to the 4.0 milestone Dec 25, 2020
This was referenced Dec 25, 2020
@Harmon758 Harmon758 changed the base branch from master to video-upload December 28, 2020 02:51
@Harmon758 Harmon758 merged commit 7487c20 into tweepy:video-upload Dec 28, 2020
@Harmon758
Copy link
Member

I've gone ahead and made a branch, https://github.com/tweepy/tweepy/tree/video-upload, and merged this into it.
I've also drafted PR #1486 merging it into the master branch.
I'll be making further improvements in that branch and PR, but as I said before, feel free to PR to that branch if anyone else wants to make additional improvements as well.

@vegit0 @savetz See https://tweepy.readthedocs.io/en/latest/install.html and https://pip.pypa.io/en/stable/reference/pip_install/#git.

@Harmon758 Harmon758 added Documentation This is regarding the library's documentation Tests This is regarding the library's tests labels Dec 28, 2020
@LuisMayo
Copy link

The [{'code': 324, 'message': 'Not valid video'}] error keeps happening
Shouldn't media_upload wait for it before returning?

@Harmon758
Copy link
Member

Are you using the video-upload branch / PR #1486? This PR has been superseded by that one.

Regardless, videos need to meet certain specifications for Twitter's API.

Shouldn't media_upload wait for it before returning?

Wait for what?

@LuisMayo
Copy link

I'm using the "video-upload" branch.

Ok let me explain.
When I'm calling

 uploaded_media = api.media_upload(output_filename, media_category='TWEET_VIDEO')

I expect the function to return when the state of the upload is no longer "pending".
Instead I think it's returning right after the "finalize" call. However, after calling finalize you still have to wait until Twitter has ended processing the video, as explained in the docs:

it may also be necessary to use a STATUS command and wait for it to return success before proceeding to Tweet creation.

I feel that the same way the API is handling all the process from start to finalize, it should also wait for STATUS not to be "pending" instead of having to handle that outside this library.

I have currently fixed it on my own code using a while loop and waiting for the proper state like this:

uploaded_media = api.media_upload(output_filename, media_category='TWEET_VIDEO')
while (uploaded_media.processing_info['state'] == 'pending'):
   time.sleep(uploaded_media.processing_info['check_after_secs'])
   uploaded_media = api.get_media_upload_status(uploaded_media.media_id_string)
api.update_status('@' + tweet.author.screen_name + ' ', in_reply_to_status_id=tweet.id_str, media_ids=[uploaded_media.media_id_string])

I hope it's clear now.
Thanks

@Harmon758
Copy link
Member

Ah, I see. Thanks for the feedback.
I think I'll probably add a kwarg to allow waiting for the async finalize process to finish.
I'll look into it later and let you know in #1486.

@Harmon758
Copy link
Member

The video-upload branch / pull request #1486 should be complete now.
Any feedback or review would be appreciated.

@fitnr fitnr deleted the video-upload-3 branch February 21, 2021 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation This is regarding the library's documentation Feature This is regarding a new feature Improvement This is regarding an improvement to an existing feature Tests This is regarding the library's tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support video upload
7 participants