
Script to automatically generate video_ID.txt file #6

Open
RhetTbull opened this issue Sep 10, 2022 · 1 comment

RhetTbull commented Sep 10, 2022

Hi. First, thank you very much for this project! It saved me a huge amount of time downloading some videos. I created the following script to automatically generate the video_ID.txt file, so I'm posting it here in case it's helpful for others. It monitors the clipboard for a copied link and writes the video ID to the video_ID.txt file. Just run the script, then copy your links to the clipboard; the script will find each video ID and append it to video_ID.txt. Press Ctrl+C to exit when done.

You'll need to install pyperclip with `pip install pyperclip`.

"""Given a video link for a Wistia video, create a video_ID.txt for use with
wistia-downloader:
https://github.com/abdlalisalmi/wistia-downloader/tree/master/script

To get the video link, right click on the video and select "Copy link and thumbnail"
which will put something like this on the clipboard:

<p><a href="https://originalurl.com;wvideo=01jkcwXb9u">
<img src="https://embed-ssl.wistia.com/deliveries/88xxx1231e5e0c7a1b141a68c1.jpg?
image_play_button_size=2x&amp;image_crop_resized=960x540&amp;image_play_button=1&amp;image_play_button_color=174bd2e0"
style="width: 400px; height: 225px;" width="400" height="225"></a>
</p><p>
<a href="https://originalurl.com;wvideo=01jkcwXb9u">
URL Name</a></p>

The part we care about is wvideo=01jkcwXb9u and the ID we need is "01jkcwXb9u"

"""

import os.path
import re
import shutil
from time import sleep, time

import pyperclip

OUTPUT_FILE = "video_ID.txt"


if __name__ == "__main__":
    if os.path.exists(OUTPUT_FILE):
        # backup the file
        backup_file = f"video_ID_{int(time())}.txt"
        print(f"Backup up {OUTPUT_FILE} to {backup_file}")
        shutil.copy(OUTPUT_FILE, backup_file)

    count = 0
    first = True
    try:
        with open(OUTPUT_FILE, "w") as f:
            print("Paste link from video 'Copy link and thumbnail' here and hit Enter")
            print("Press ctrl+C when you are done")
            last_buffer = ""
            while True:
                buffer = pyperclip.paste()
                if buffer == last_buffer:
                    sleep(0.1)
                    continue
                last_buffer = buffer
                if match := re.search(r"wvideo=([\w]+)\"", buffer):
                    print(f"Found video ID: {match[1]}")
                    f.write(f"{match[1]}\n")
                    f.flush()
                    count += 1
                elif first:
                    first = False
                else:
                    print("Did not find a video ID")
    except KeyboardInterrupt:
        print(f"Found {count} video IDs and wrote them to {OUTPUT_FILE}")

@AlexanderMcColl

I found this software impossible to use and gave up, but your Python script was fantastic, thank you!!!

I referenced this gist: https://gist.github.com/szepeviktor/2a8a3ce8b32e2a67ca416ffd077553c5
to create a Python script that takes your list of video IDs and returns a list of .mp4 URLs, which you can then pipe into whatever download method is most convenient for you (I have a homelab that I can leave running overnight iterating through a list of URLs to download):

import requests
import re
import json

# Read the list of video IDs (skip blank lines)
with open('video_ID.txt', 'r') as id_file:
    video_ID = [line.strip() for line in id_file if line.strip()]

# Download the webpage
with open("video_urls.txt", "a") as f:
    for video in video_ID:
        url = f"http://fast.wistia.net/embed/iframe/{video}"
        response = requests.get(url)
        content = response.text

        # Extract the URL using regular expressions
        pattern = r'^\s*W\.iframeInit\((.+), {[^}]*}\);\s*$'
        match = re.search(pattern, content, re.MULTILINE)
        if match:
            json_str = match.group(1)
            data = json.loads(json_str)
            url = data["assets"][0]["url"]
            url = re.sub(r'\.bin$', '.mp4', url)
            # Write it to the output file specified above
            f.write(url + "\n")

GPT-3.5 wrote the Python code for me on the first try from the gist's shell code. I just added the `with` and `for` loops so it can iterate. Unbelievable how good it is for getting little scripts going.
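
In case it's useful, here's a minimal sketch of one way to consume video_urls.txt and download each file with requests streaming (the local filenames are just derived from the end of each URL, so adjust to taste):

import os.path

import requests

# Read the list of .mp4 URLs produced by the script above
with open("video_urls.txt") as f:
    urls = [line.strip() for line in f if line.strip()]

for url in urls:
    # Use the last path component (minus any query string) as the local filename
    filename = os.path.basename(url.split("?")[0])
    print(f"Downloading {url} -> {filename}")
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(filename, "wb") as out:
            # Stream in 1 MB chunks so large videos don't have to fit in memory
            for chunk in response.iter_content(chunk_size=1024 * 1024):
                out.write(chunk)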
