(Twitter) Downloading gif media as gifs, downloading tweet text, and enhanced naming #4459
Comments
Not really, no. Any gif uploaded to Twitter gets automatically converted to mp4 on their servers and is only downloadable as such. (#2691)
You need a config file and use a
Use the filename option with the available metadata fields, which you can find with Things like
Wouldn't it be better to put all media of one user in a separate directory then?
I don't think Twitter provides metadata for that.
{
    "extractor": {
        "twitter": {
            "filename": "{author['name']}-{tweet_id}-{num}.{extension}",
            "directory": ["{user['name']}"],
            "postprocessors": [
                {
                    "name": "metadata",
                    "format": "\fTF path/to/template",
                    "event": "post",
                    "filename": "{author['name']}-{tweet_id}-0Text.txt"
                }
            ]
        }
    }
} |
Seems like this probably belongs in Discussions and not in Issues, but okay..
Well, guessing user intentions is generally not the strong suit of any computer program.. 😄 That said, not sure if there are actually any real GIF files on Twitter? Aren't they all MP4 now anyways?
Sure thing, simply set up a metadata post-processor in your config for Twitter, which would look a little something like this:
{
    "name": "metadata",
    "event": "post",
    "mode": "custom",
    "content-format": "{retweet_id|tweet_id}:{content}",
    "filename": "Twitter__{author[name]}__{date:%Y.%m.%d}__{retweet_id|tweet_id}.txt",
    "directory": "Tweetcontent"
}
The important part here is (Documentation of options for the metadata post-processor begin here)
Very easy, but you should probably familiarize yourself with the
What you describe is probably just a simple setting for
But if I were you, I would at least add something like
Can't tell you off the top of my head whether Twitter provides such metadata, but it probably provides more than enough.. Anyway, here is your most important gallery-dl advice:
*: Not just for Twitter, but for all sites supported by gallery-dl |
Damn, beat me by a minute 😆 |
Writing documentation and keeping it up to date is not something I'm particularly good at ... |
Ah, so I was actually using the newer/suggested/canonical variant, that's good to know. 😄
Don't worry about that in my opinion, the documentation is actually really good! It's a documentation wasteland out there for so many projects, and gallery-dl is holding up very very well here. |
Thank you all for your feedback. I appreciate the time.
I am completely expecting it to be me simply going about this incorrectly, and I apologize for taking more time in asking how to do this correctly, but I'll attach the modified twitter.py file I have just to help see where I messed up. (Uploading as .txt) (Edit |
Ah, I guess there has been a slight misunderstanding, my bad. You certainly do not need to edit any python files, at all. Take a look at the Configuration section of the main readme, it's basically just two short paragraphs. There are also two example I also just remembered the wiki page, giving a quick rundown and explaining the necessary things even for newcomers: |
Well, not really more "optimal" or better. They both work the same way, they are functionally the same |
Ah, that helps. Thank you for clarifying.
I got some of it to work, but am having a bit of difficulty. Pasting what I had as is gives me this error:
Removing the commas at the end of |
Okay, so pretty good news on this, I got it to work pretty much as desired (although not using the format I intended that shows up in an empty config file), but yeah I got names mostly working and text files downloading. Here is my config:
I did intentionally set Directory to null to get txt files in the same dir as the media, which I'm happy about
Similar to that, what can be done in a text file by configs alone? Would it be tricky to get it to do something that looks like this:
I know that the Date can be given, but I notice that with the content as it is in my configs, it groups the hashtags along with the tweet message. Is this possible to have on a separate line? I apologize if I am asking a lot of questions, I am really just happy that a tool that can do this exists at all and want to use it to its best abilities. |
Yup, exactly. You have trailing commas in two places in the snippet above.
Also, if you don't use any settings (which is fine, the defaults are designed to be reasonable) for
The problem with your config example here, while it is correct in principle (besides that you have to give a name to a post-processor set up in that place), is that you create a post-processor for metadata, but you are not using any post-processor when running gallery-dl with a Twitter URL.. |
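On the trailing commas: one way to catch them before running gallery-dl is to pass the config through any strict JSON parser. Below is a minimal sketch using Python's standard `json` module. The two snippets are made up for illustration, and `check_json` is a hypothetical helper, not part of gallery-dl.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# Hypothetical snippet with a trailing comma after the last key.
broken = '''{
    "name": "metadata",
    "event": "post",
    "directory": "Tweetcontent",
}'''

# Same snippet with the trailing comma removed.
fixed = '''{
    "name": "metadata",
    "event": "post",
    "directory": "Tweetcontent"
}'''

print(check_json(broken))  # an error message pointing near the closing brace
print(check_json(fixed))   # None
```

The same kind of check can be run on a whole file from the command line with `python -m json.tool config.json`.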
Your Example
{
    "extractor": {
        "twitter": {
            "filename": "{author['name']}-{tweet_id}-{num}Media{extension}.{extension}",
            "directory": ["{user['name']}"],
            "username": "Twitter032",
            "password": "Twixer(These are not real)",
            "postprocessors": [{
                "name": "metadata",
                "event": "post",
                "mode": "custom",
                "content-format": ["https://twitter.com/{author[name]}/status/{retweet_id|tweet_id}",
                                   "Author: @{author[name]}",
                                   "Posted: {date:%Y/%m/%d_%I:%M%p}",
                                   "",
                                   "{content}"],
                "filename": "{author[name]}-{retweet_id|tweet_id}-0Text.txt",
                "directory": null
            }]
        }
    }
}
Okay, you already found it. 😄
Yes! You can set "filename": {
    "extension == 'png'": "{tweet_id}-{num}-Photo.{extension}",
    "extension == 'mp4'": "{tweet_id}-{num}-Video.{extension}",
    ""                  : "{author['name']}-{tweet_id}-{num}Media{extension}.{extension}"
}
The last line is the default condition; it's what will be used if the checks in the two lines before don't evaluate to true. Although be careful to not use an overly generic filename, i.e. never just like
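To make the evaluation order concrete, here is a rough Python sketch of the "first matching condition wins, empty key is the fallback" behaviour described for conditional filenames. It only mimics the idea and is not gallery-dl's actual implementation: `pick_filename`, the sample metadata, and the use of `eval` and `str.format` are all assumptions made for illustration.

```python
def pick_filename(rules, metadata):
    """Return the formatted filename for the first rule whose condition is true.

    `rules` maps a Python expression (as a string) to a format string.
    The empty-string key is treated as the default.
    """
    for condition, template in rules.items():
        if condition == "":
            continue  # default entry, used only if no real condition matches
        # Evaluate the condition with the metadata fields as local variables.
        if eval(condition, {"__builtins__": {}}, dict(metadata)):
            return template.format(**metadata)
    return rules[""].format(**metadata)


rules = {
    "extension == 'png'": "{tweet_id}-{num}-Photo.{extension}",
    "extension == 'mp4'": "{tweet_id}-{num}-Video.{extension}",
    "": "{author[name]}-{tweet_id}-{num}Media{extension}.{extension}",
}

# Made-up metadata for three files of one tweet.
base = {"author": {"name": "Doe"}, "tweet_id": 1234567890}
for num, ext in enumerate(["png", "mp4", "jpg"], start=1):
    print(pick_filename(rules, {**base, "num": num, "extension": ext}))
```

In this sketch a `jpg` file matches neither condition and falls through to the default entry, which is why the default should still produce a unique name per file.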
Dunno, don't know the entirety of the Twitter metadata by heart..
Not tricky at all, take a look at Special Type Format String on how to set up a template file. A normal template should be enough, e.g.
And then set up this template file like you want, e.g.
and so on.. |
Sorry for taking a bit to respond, here is my progress:
Regarding the filename, I am delighted to say that, so far, it works precisely as desired. Thank you for this. I think the only thing stopping me from getting any further with naming is the lack of available metadata on Twitter's side for things like tagging posts as sfw/nsfw and automatically applying tags about the image itself. I am really glad I was able to get this far. (I did have to add a comma at the end of the } in your example to get it to work, but that was thankfully a super easy fix. Maybe Python isn't as hard to learn after all.)
I did attempt to include
Do you happen to have the code that stands for screen names? I haven't been able to find it; I tried looking it up but have had no luck. (Screen names being like [Screen name]@handle_name, basically the nickname on Twitter.)
I tried looking at the Special Type Format String section you mentioned, but I am not really sure I understand it. What I do understand is that you recommend I use the T type Format String ( |
And JSON is even easier... 😆
Check with
Yes, it's exactly that. |
Very excitingly, I did find it via -K. Also, I noticed something in the filename setup: using the same format as above, files whose extension is mp4 are for some reason still being given the "Photo" tag, despite the config saying to name mp4s as "Video". Hmmm. (I haven't tested the template yet, just wanted to update on the nickname bit.) |
Ah, I see what you mean. And |
Ah, that worked. I never realized this chat could update live =0
Yep got it working lol. Put it in /home/twitter.txt
and in config.json: |
Here's my latest update to my txt contents:
I added some additional metadata and edited the date to just
The only things I didn't see during my -K run were the view count of a tweet and the device used to send it, which was a tad disappointing, but it's fine. I also wonder what I can use to get timezones in the mix, just for date clarity. |
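On the date question: the part after the colon in a field like `{date:%Y/%m/%d_%I:%M%p}` looks like an ordinary `strftime` pattern, so it can be tried out in plain Python first. The sketch below assumes the `date` field is a timezone-naive `datetime` in UTC (not verified here for Twitter) and uses a made-up timestamp. It shows that the timezone codes `%Z` and `%z` only print something once a timezone is attached to the value.

```python
from datetime import datetime, timezone

# Made-up timestamp standing in for a tweet's `date` field (assumed naive UTC).
date = datetime(2023, 8, 25, 15, 4, 0)

# Same pattern as used in the "Posted:" line of the config.
posted = "Posted: {date:%Y/%m/%d_%I:%M%p}".format(date=date)
print(posted)

# A naive datetime carries no timezone, so %Z and %z expand to empty strings.
naive = "{date:%Y-%m-%d %H:%M %Z%z}".format(date=date)
print(repr(naive))

# Attaching UTC explicitly makes the timezone visible in the output.
aware = date.replace(tzinfo=timezone.utc)
with_tz = "{date:%Y-%m-%d %H:%M %Z%z}".format(date=aware)
print(with_tz)
```

If the assumption holds, writing the zone as literal text in the pattern (for example a trailing ` UTC`) may be the simplest way to make the timestamps unambiguous.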
Hey, how possible is it to have multiple files created?
The idea I had for the FileNames was to determine the last time the data has been modified, and if unchanged during a gallery-dl command, just to ignore making the file (I assume null would work), and if different to then create the file. Idk what it'd look like to actually check for this though. All this while also downloading the Profile Banner/Image in the mentioned {author[name]}-Profile dir in the same fashion as the text file? |
Would it be possible for gallery-dl to detect what media is intended to be a gif file and convert it into such instead of an mp4?
Alongside this, would it also be possible to download the text associated with a tweet as a .txt file, and within said file include the tweet metadata? (Example of .txt file contents in attached file)
Example.txt
And finally, for file organizing, how possible is it to set tweet names to something like Author_TweetID?
Here's an example name I came up with a while back:
(File Names):
Doe-1234567890-0Text.txt
Doe-1234567890-1MediaImage.png/jpeg
Doe-1234567890-2MediaMov.mp4/webm/etc
Doe-1234567890-3MediaGif.gif
My reason for adding Media before the type is for ease of searching, if you wanted to search for any kind of media vs just a specific type, you could just search for "Media", and if you wanted only a specific type, can instead search for that.
Additionally, my reason for starting the Filename with the Author would be for ease of organizing as well, since if you sort it alphanumerically, you will be able to group the tweets by author. And since it's followed by the tweet ID, each tweet will be chronologically organized per author.
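A small Python sketch of that sorting argument, using invented filenames in the proposed scheme. One caveat it illustrates: a plain alphanumeric sort compares tweet IDs as text, so it only matches numeric (and hence chronological) order while all IDs have the same number of digits. `sort_key` is a hypothetical helper that compares the ID as a number instead.

```python
# Invented filenames following the Author-TweetID-NType pattern.
names = [
    "Roe-1234567890-1MediaImage.png",
    "Doe-1234567890-0Text.txt",
    "Doe-999999999-1MediaImage.png",
    "Doe-1234567890-1MediaImage.png",
]


def sort_key(name):
    """Split into author, numeric tweet ID, and the rest of the name.

    Assumes the author part contains no '-' character.
    """
    author, tweet_id, rest = name.split("-", 2)
    return (author, int(tweet_id), rest)


# Plain text sort: groups by author, but compares IDs character by character.
plain = sorted(names)

# Numeric sort: groups by author, then orders by the ID's numeric value.
numeric = sorted(names, key=sort_key)

for name in plain:
    print("plain:  ", name)
for name in numeric:
    print("numeric:", name)
```

Both orderings keep each author's files together; they differ only in where the shorter ID ends up within the author's group.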
Additionally, is it at all possible to apply tags to a tweet's filename? I.e. whether it's NSFW, Safe, or in between (I would call this SNFW, for "Safe, but Not For Work"; open to ideas). These names would be used exactly as written, so as not to get (N)SFW results when typing SFW instead of Safe.
I doubt tagging media details would be possible since even on sites that support this, it is applied manually by the user/admins, but if it were possible it would be neat