How To Add "date", "title", "description", & "tags" to a TEXT file? #26831
Comments
|
is this possible? any other similar solutions? thx. |
|
Well, there are the Copy this code into a file and run it from the command line like this:
#!/bin/bash
# Get the video link from the command line input.
LINK_STR="$1"
TITLE_TEMPLATE="%(title)s.%(ext)s"
# Path to the folder you want to download to.
# This could be substituted with "$2" then include the output folder as the second argument when executing this file.
OUTPUT_PATH="/example/output/path"
cd "$OUTPUT_PATH" || exit
# Get the title of the video that will be downloaded to set the output filename.
# NOTE: --restrict-filenames removes any invalid filename characters from the title.
OUTPUT_TITLE="$(youtube-dl -o "$TITLE_TEMPLATE" --get-title --restrict-filenames "$LINK_STR")"
# Get the json info and put select info in a text file.
# "|" pipes the outputs of each command to the next one.
# Get a string of all the json info for the input video:
# "$(youtube-dl --dump-json $LINK_STR)"
# Make our own output object with custom keys. (See https://jqplay.org for a breakdown.)
# NOTE: tags are an array so it needs to be run through | join(", ") or else the tag writing fails.
# NOTE: the date comes through as yyyymmdd so format it by selecting substrings and inserting dashes between them.
# "jq -r '{ Date: .upload_date | (.[0:4] + "-" + .[4:6] + "-" + .[6:8]), Title: .fulltitle, Description: .description, Tags: .tags | join(", ") }'"
# Format our custom object by setting a leading "key" and "value" so we can wrap the value in quotes.
# " | jq -r 'to_entries | .[] | .key + ": \"" + .value + "\""'
# Write the output to a new file.
# > "$OUTPUT_TITLE.txt"
echo "$(youtube-dl --dump-json "$LINK_STR")" | jq -r '{ Date: .upload_date | (.[0:4] + "-" + .[4:6] + "-" + .[6:8]), Title: .fulltitle, Description: .description, Tags: .tags | join(", ") }' | jq -r 'to_entries | .[] | .key + ": \"" + .value + "\""' > "$OUTPUT_TITLE.txt"
Which produces a file like this:
Woah, didn't expect to see porn, sexy, and warfare on there |
|
Wow, Thank you @Fetchinator7 feels a little (honestly, a lot) over my head but i will give it a try. i think you're right, given the lack of comments here & over on stack... it seems the json dump (--write-info-json) is my only option. i did that & it contains SO much other stuff that i don't know what it all is. if your script doesn't work, perhaps i can find some mac app or script that can bulk/batch extract just those parts. what is "jq"? i won't have time to try this for a couple days, but i'll be sure to post back when i do. thanks for your generosity. |
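On the "what is jq?" question: jq is a command-line JSON processor that can pick individual fields out of a large JSON document. The following is only a minimal sketch of the idea. The sample JSON is made up for illustration and stands in for the much larger dump that youtube-dl produces, and the script assumes jq is installed (on a Mac with Homebrew, `brew install jq`).

```shell
#!/bin/bash
# Hypothetical sample standing in for the much larger youtube-dl JSON dump.
SAMPLE_JSON='{"upload_date":"20200115","fulltitle":"My Video","description":"A short clip.","tags":["music","live"],"view_count":42}'

if command -v jq >/dev/null 2>&1; then
  # Print one line per wanted field; anything not named here (e.g. view_count) is dropped.
  RESULT="$(printf '%s' "$SAMPLE_JSON" | jq -r '
    "Date: "        + (.upload_date | .[0:4] + "-" + .[4:6] + "-" + .[6:8]),
    "Title: "       + .fulltitle,
    "Description: " + .description,
    "Tags: "        + (.tags | join(", "))')"
  echo "$RESULT"
else
  echo "jq is not installed" >&2
fi
```

Everything between the single quotes is the jq "program"; the rest is ordinary shell.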
|
i really do appreciate the time you took to respond to me. unfortunately, it seems to be a bit over my head :-( i'm trying to "batch" download all the videos from my playlists & am needing to make the process as easy as possible. i've managed to finally piece together this line that works great to do the actual downloading... but the JSON file that this produces is huge with tons of UNneeded stuff. i've dug thru it and to my surprise discovered that the Youtube API doesn't seem to include the "Tags" in this. i haven't been able to find any way to get the tags with these videos. oh well, not the end of the world. so, i guess since there doesn't seem to be a way to pre-process what i want into a text file, THEN... is it possible to "batch" all the JSON files that this downloads and delete everything in it EXCEPT the Date, Title, & Description? |
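One possible way to batch-process JSON files that have already been downloaded is sketched below. It is an illustration, not a tested solution for this exact setup: the function name `extract_info_txt` is hypothetical, it assumes jq is installed, and it assumes the files end in `.info.json`, which is how `--write-info-json` names them. Rather than deleting content from the JSON files, it writes the three wanted fields to a `.txt` file next to each one and leaves the originals alone.

```shell
#!/bin/bash
# Sketch: for every *.info.json file in a folder, write Date/Title/Description
# to a .txt file with the same base name. The function name is hypothetical.
extract_info_txt() {
  local dir="$1" json_file
  for json_file in "$dir"/*.info.json; do
    # Skip the unexpanded pattern when the folder has no matching files.
    [ -e "$json_file" ] || continue
    jq -r '
      "Date: "        + (.upload_date | .[0:4] + "-" + .[4:6] + "-" + .[6:8]),
      "Title: "       + .fulltitle,
      "Description: " + .description
    ' "$json_file" > "${json_file%.info.json}.txt"
  done
}

# Example call (path taken from the thread, adjust as needed):
# extract_info_txt "/Volumes/LaCie/DUMPSTER"
```

Once the `.txt` files look right, the `.info.json` files could be removed by hand.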
|
I added your download command so now this will generate the text files and download the videos. I'm really sorry, I forgot I made that file executable. You need to either run Take out the Remember, "tags" is all lowercase as seen in Keep in mind that
#!/bin/bash
# Path to the folder you want to download to.
# This could be substituted with "$2" then include the output folder as the second argument when executing this file.
OUTPUT_PATH="/Volumes/LaCie/DUMPSTER"
cd "$OUTPUT_PATH" || exit
# Temp file with the video ids on new lines.
FILENAME="playlist_video_ids.txt"
# Playlist url from command line.
PLAYLIST_URL="$1"
# Get all the video ids in the playlist and put them on new lines in a text file.
echo "$(youtube-dl --get-id "$PLAYLIST_URL")" > "$FILENAME"
# Read the text file of video ids.
ALL_LINES="$(cat "$FILENAME")"
# Download each video in the playlist individually.
for VIDEO_ID in $ALL_LINES ;
do
# Get the link to the video by appending the video id.
LINK_STR="https://youtu.be/$VIDEO_ID"
TITLE_TEMPLATE="%(upload_date)s %(title)s.%(ext)s"
TITLE_TEMPLATE_NO_EXT="%(upload_date)s %(title)s"
# Get the restricted filename of the video that will be downloaded to set the output filename.
# NOTE: --restrict-filenames removes any invalid filename characters from the title.
OUTPUT_TITLE="$(youtube-dl -o "$TITLE_TEMPLATE_NO_EXT" --restrict-filenames --get-filename "$LINK_STR")"
# Wait for 10 seconds to avoid youtube blocking our requests.
sleep 10
# Run your command to download the videos in the playlist.
YOUTUBE_DL_OUTPUT="$(youtube-dl -i -o "$TITLE_TEMPLATE" -f "(bestvideo[width>=1920])+bestaudio" --restrict-filenames --recode-video mp4 --add-metadata --embed-thumbnail --all-subs --embed-subs "$LINK_STR")"
echo "$YOUTUBE_DL_OUTPUT"
# Get the json info and put select info in a text file.
# "|" pipes the outputs of each command to the next one.
# Get a string of all the json info for the input video:
# "$(youtube-dl --dump-json $LINK_STR)"
# Make our own output object with custom keys. (See https://jqplay.org for a breakdown.)
# NOTE: tags are an array so it needs to be run through | join(", ") or else the tag writing fails.
# NOTE: the date comes through as yyyymmdd so format it by selecting substrings and inserting dashes between them.
# "jq -r '{ Date: .upload_date | (.[0:4] + "-" + .[4:6] + "-" + .[6:8]), Title: .fulltitle, Description: .description, Tags: .tags | join(", ") }'"
# Format our custom object by setting a leading "key" and "value" so we can wrap the value in quotes.
# " | jq -r 'to_entries | .[] | .key + ": \"" + .value + "\""'
# Write the output to a new file.
# > "$OUTPUT_TITLE.txt"
echo "$(youtube-dl --dump-json "$LINK_STR")" | jq -r '{ Date: .upload_date | (.[0:4] + "-" + .[4:6] + "-" + .[6:8]), Title: .fulltitle, Description: .description, Tags: .tags | join(", ") }' | jq -r 'to_entries | .[] | .key + ": \"" + .value + "\""' > "$OUTPUT_TITLE.txt"
done
# Delete the temporary file.
rm "$FILENAME"
Then run it from the command line:
NOTE: I don't think you want to keep the thumbnail, but in case you do, add this back in: |
|
@Fetchinator7 out of 205 videos in the first playlist...
...so any idea how i can make the process more foolproof?

re: about the bestvideo+bestaudio bit... i specifically landed on what i had for that because after much trial'n'error'n'reading, discovered those look at the bitrate over pixel size. i was having a LOT of 720p versions download instead of the original 1080p version. so the way i had it written seemed to be the only way i found to obtain the original 1080p versions (then go down if that size didn't exist).

would my removing your question: most of the titles on youtube, and within the descriptions, there are quotes. should i somehow strip those? wondering if that would cause any problems IF i wanted to import all these text files into a spreadsheet or if someday in the future there becomes a way to import these into a database of some sort. i noticed in the json's those inner quotes were escaped with a backslash. i get the purpose of having the whole value encased in quotes, so i probably should 'not' remove those, unless there's a better character to use besides quotes (perhaps curly brackets or backticks?) to keep the inner ones as-is for readability. your thoughts on that?

and... SUPER HAPPY that i was wrong about the Tags not being available. i must've just missed it in the json amidst the plethora of other junk :-)

again, THANK YOU SO MUCH for your help on this. i recognize it's above'n'beyond. is there some way i can compensate you for your time & expertise? ✌🏼 |
|
followup... i got adventurous with regards to my question about replacing the surrounding quotes. also of note, it takes a good 30-60'ish seconds to get started & more than the 10 seconds between each record. is that normal? i'm still pretty concerned about all the errors & missing stuff i described above. am seeking to make that more foolproof before i get started on my actual project; but i may need to just get going on it & go thru each batch like i did the above to find any problems. would obviously like to avoid that hassle ongoing if possible. any other help with all this would be greatly appreciated :-) |
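One way to make a batch like this more forgiving is to retry a failed step after a pause instead of moving straight on to the next video. The helper below is a sketch only: `retry_with_pause` is a hypothetical name, not part of youtube-dl, and it assumes the wrapped command exits with a non-zero status when it fails.

```shell
#!/bin/bash
# Sketch: run a command, retrying up to max_attempts times with a pause in between.
# retry_with_pause is a hypothetical helper, not part of youtube-dl.
retry_with_pause() {
  local max_attempts="$1" pause_seconds="$2" attempt=1
  shift 2
  # "$@" is the command to run; loop until it succeeds.
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$pause_seconds"
  done
}

# Example use inside the download loop (failed_ids.txt is a made-up file name):
# retry_with_pause 3 30 youtube-dl -o "$TITLE_TEMPLATE" "$LINK_STR" || echo "$VIDEO_ID" >> failed_ids.txt
```

Logging the ids that still fail gives a short list to re-run later, rather than having to check all 205 results by hand.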
|
@syberknight If you want to use this in a spreadsheet you should probably redo it for that format. I imagine there’s some json to spreadsheet converter that would be easier. |
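As a rough illustration of what "redoing it" for a spreadsheet could look like, jq's `@csv` filter writes one CSV row and escapes inner quotes by doubling them, so quotes in titles and descriptions would not need to be stripped. This is a sketch with made-up sample data, and it assumes jq is installed.

```shell
#!/bin/bash
# Hypothetical sample with quotes inside the title, as on many YouTube videos.
SAMPLE_JSON='{"upload_date":"20200115","fulltitle":"The \"Best\" Day","description":"Line one","tags":["a","b"]}'

if command -v jq >/dev/null 2>&1; then
  # Build an array of the four fields, then format it as a single CSV row.
  CSV_ROW="$(printf '%s' "$SAMPLE_JSON" | jq -r '[.upload_date, .fulltitle, .description, (.tags | join(", "))] | @csv')"
  echo "$CSV_ROW"
fi
```

Appending one such row per video to a single file (`>> videos.csv`, a made-up name) would give something a spreadsheet app can open directly.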
|
@Fetchinator7 i will try it again by increasing the 10 seconds part & see what happens & report back. |
|
@syberknight I feel like this is going beyond the scope of this GitHub issue so do you want to join my new discord server or message Fetchinator7#9036 on discord? Or some other platform? |
|
@Fetchinator7 aside from those issues, it's working perfect ;-) i'm so sorry, and am so appreciative of your help. i'm not sure what a discord server is but i want to do what's appropriate. so i'm game... signing up now... |
Question
i'm using youtube-dl in the Terminal on a Mac (Catalina) to download all my videos from YouTube.
i would really like to have the...
Date:
Title:
Description:
Tags:
...in a .txt file for each download, plain'n'simple :-)
i have figured out & installed homebrew to get ffmpeg & atomicparsley, and have read elsewhere that those 'could' be used to do such a thing, but cannot find nor figure out how to do it.
i know there's the "--write-info-json" option, but that's just WAY tooooo much.
i also am aware of the "--write-description" but that's obviously just the description.
alternatively, if there's a way to make the json file to ONLY include those 4 items, that would be acceptable too.
any help would be greatly appreciated!
thanks!
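For the alternative mentioned above (a JSON file with only those 4 items), one possible approach is to pass the full JSON through jq and keep just the wanted keys. This is a sketch under assumptions: jq is installed, the sample JSON is invented, and the key names `upload_date`, `fulltitle`, `description` and `tags` are the ones used in the scripts earlier in this thread.

```shell
#!/bin/bash
# Hypothetical sample standing in for the full JSON that youtube-dl writes.
FULL_JSON='{"upload_date":"20200115","fulltitle":"My Video","description":"A short clip.","tags":["music","live"],"view_count":42,"formats":[]}'

if command -v jq >/dev/null 2>&1; then
  # {a, b} is jq shorthand for {a: .a, b: .b}; all other keys are dropped.
  SMALL_JSON="$(printf '%s' "$FULL_JSON" | jq '{upload_date, fulltitle, description, tags}')"
  echo "$SMALL_JSON"
fi
```

With real data the input would come from `youtube-dl --dump-json` or from an existing `.info.json` file instead of the sample string.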