# TikTok-YouTube Project


The aim of this project is to pull trending videos from TikTok and compile them into a single video that is automatically published to YouTube. Once the pipeline runs autonomously, I aim to publish videos 4-6 times daily, which will hopefully generate some residual income.

## Load Modules

In [None]:
import http.client # Query API
import pandas as pd # Manipulate Dataframe
import json # Manipulate JSON objects
import urllib.request # Download videos from URL
from os import listdir # List files in a directory
from os.path import isfile, join # Joining and filtering files
from moviepy.editor import * # Write final video to drive

## Retrieve Video Data

This code queries a TikTok API hosted on RapidAPI and pulls video data for a user-selected hashtag. There is scope to generate the hashtag automatically, for example by running another API query, selecting the most popular hashtag, and then using it in the query that pulls the data.

In [None]:
# Connect to API
conn = http.client.HTTPSConnection("tiktok82.p.rapidapi.com")

# API key for RapidAPI - this should be moved to secret storage rather than hardcoded
headers = {
    'X-RapidAPI-Key': "ee9f9a0a74mshdf11304da3f9e15p15be10jsna76036e001f3",
    'X-RapidAPI-Host': "tiktok82.p.rapidapi.com"
    }

# Perform the request; both the hashtag and the cursor value can be changed
conn.request("GET", "/getChallengeVideos?hashtag=fyp&cursor=50", headers=headers)

# Get the response from API
res = conn.getresponse()

# Read the data
tiktok_hashtag_info = res.read()
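
The comments above mention storing the key secretly and changing the hashtag and cursor. A minimal sketch of both ideas follows. It assumes the key is exported in an environment variable called `RAPIDAPI_KEY` (a name chosen here for illustration, not something RapidAPI requires), and `load_headers` and `challenge_path` are hypothetical helper names.

```python
import os
from urllib.parse import urlencode

API_HOST = "tiktok82.p.rapidapi.com"  # host used in the cell above


def load_headers(env=os.environ):
    """Build the RapidAPI headers from an environment mapping."""
    # RAPIDAPI_KEY is an assumed variable name, chosen for this sketch
    key = env.get("RAPIDAPI_KEY")
    if not key:
        raise RuntimeError("Set the RAPIDAPI_KEY environment variable first")
    return {"X-RapidAPI-Key": key, "X-RapidAPI-Host": API_HOST}


def challenge_path(hashtag, cursor=0):
    """Build the request path for a hashtag and cursor value."""
    query = urlencode({"hashtag": hashtag, "cursor": cursor})
    return f"/getChallengeVideos?{query}"


print(challenge_path("fyp", 50))  # /getChallengeVideos?hashtag=fyp&cursor=50
```

With these helpers the request would become `conn.request("GET", challenge_path("fyp", 50), headers=load_headers())`, and the key no longer appears in the notebook.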

## Manipulate Data

Here we convert the response into a dataframe and reshape it until we are ready to download the videos from TikTok.

In [None]:
# Parse the JSON response and select the video records under data -> item_list
items = json.loads(tiktok_hashtag_info.decode("utf-8"))['data']['item_list']

# Convert the list of dictionaries to a dataframe
df = pd.DataFrame(items)

# Select columns required
df = df[['id', 'desc', 'createTime', 'video', 'author', 'stats', 'authorStats']]

In [None]:
# Unnest the video column and remove it
df = pd.concat([df.drop('video', axis=1),
                pd.DataFrame(df['video'].tolist())], axis=1)

# Unnest the stats column and remove it
df = pd.concat([df.drop('stats', axis=1),
                pd.DataFrame(df['stats'].tolist())], axis=1)

# Unnest the authorStats column and remove it
df = pd.concat([df.drop('authorStats', axis=1),
                pd.DataFrame(df['authorStats'].tolist())], axis=1)

# Unnest the author column and remove it. The author data has its own id
# column (the author id, not the video id), so it is joined back to the
# dataframe with json_normalize rather than with the approach used above
df = pd.concat([df.drop('author', axis=1),
                pd.json_normalize(df['author'])], axis=1)
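
The four unnesting steps follow the same pattern, and the duplicate `id` column comes from the author data reusing that name. A minimal sketch of one way to factor this out is given here, using a made-up two-row sample rather than real API output. `flatten_column` and the `author_` prefix are hypothetical choices for illustration.

```python
import pandas as pd


def flatten_column(frame, column, prefix=None):
    """Expand a column of dictionaries into separate columns."""
    nested = pd.json_normalize(frame[column].tolist())
    nested.index = frame.index  # keep rows aligned with the original frame
    if prefix:
        # Prefixing avoids clashes such as the author id versus the video id
        nested = nested.add_prefix(prefix)
    return pd.concat([frame.drop(columns=column), nested], axis=1)


# Made-up sample with the same nesting as the API response
sample = pd.DataFrame({
    "id": ["111", "222"],
    "video": [{"duration": 10, "downloadAddr": "a"},
              {"duration": 20, "downloadAddr": "b"}],
    "author": [{"id": "u1", "uniqueId": "alice"},
               {"id": "u2", "uniqueId": "bob"}],
})

flat = flatten_column(sample, "video")
flat = flatten_column(flat, "author", prefix="author_")
print(list(flat.columns))
```

With a prefix on the author fields there is only one column called `id`, so selecting it by position is no longer needed. Note that the later cells would then have to refer to `author_uniqueId` and `author_nickname`.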

In [None]:
# Combine the first id column with the rest of the desired columns. df.iloc[:, 0] is used because
# several columns are now called id and we only want the first one (the video id)
df = pd.concat([df.iloc[:,0], df[['desc', 'createTime', 'height', 'width', 'duration', 'downloadAddr',
   'shareCount', 'commentCount', 'playCount', 'followerCount', 'heartCount',
   'uniqueId', 'nickname']]], axis = 1)

# Sort the dataframe - not sure whether to keep this, as the video download seems to use the original order anyway
# df = df.sort_values(['playCount', 'shareCount', 'followerCount'], ascending = False)

In [None]:
# Create the download url based on what is used in the API example online
df['videoUrl'] = "/getDownloadVideoWithWatermark?video_url=https%3A%2F%2Fwww.tiktok.com%2F%40" + df['uniqueId'] + "%2Fvideo%2F" + df['id']
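
The path above is percent-encoded by hand. As a sketch of an alternative, the standard library's `urllib.parse.quote` can encode the TikTok page URL, which also copes with usernames that contain characters needing escaping. `download_path` is a hypothetical helper name, and the endpoint is the one used in the cell above.

```python
from urllib.parse import quote


def download_path(unique_id, video_id):
    """Build the API path that requests download links for one video."""
    video_page = f"https://www.tiktok.com/@{unique_id}/video/{video_id}"
    # safe="" makes quote() encode ':' and '/' as well
    return "/getDownloadVideoWithWatermark?video_url=" + quote(video_page, safe="")


print(download_path("user", "123"))
# /getDownloadVideoWithWatermark?video_url=https%3A%2F%2Fwww.tiktok.com%2F%40user%2Fvideo%2F123
```

Applied to the dataframe, this could be written as `df['videoUrl'] = [download_path(u, v) for u, v in zip(df['uniqueId'], df['id'])]`.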

In [None]:
# Save the dataframe at this point to edit
# df.to_csv("storedData.csv")

## Download Videos

The code below downloads each video with its watermark.

In [None]:
# Pull the column containing the urls to download videos
vids_to_download = df['videoUrl']

In [None]:
# Connection to API
conn = http.client.HTTPSConnection("tiktok82.p.rapidapi.com")

# Again add the headers
headers = {
    'X-RapidAPI-Key': "ee9f9a0a74mshdf11304da3f9e15p15be10jsna76036e001f3",
    'X-RapidAPI-Host': "tiktok82.p.rapidapi.com"
    }

# Location to save the videos before combining them
loc = "./VidsForUpload/"

# Loop through the download urls, fetch each video and save it to disk
for i, vid_url in enumerate(vids_to_download):
    # Store files in the form vid_09.mp4 etc...
    filename = f"vid_{i:02d}.mp4"
    # Combine the location and filename to get the save location
    save_location = loc + filename

    # Request the download links for the video
    conn.request("GET", vid_url, headers=headers)
    res = conn.getresponse()
    data = res.read()

    # Extract the HQ download link from the response
    hq_url = json.loads(data.decode("utf-8"))['video_url_HQ']

    # Download the file from the link
    urllib.request.urlretrieve(hq_url, save_location)
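
The loop above assumes every response is valid JSON containing a `video_url_HQ` key, so a single failed request would stop the whole run. A minimal sketch of a more forgiving parse step is shown here. `parse_download_link` is a hypothetical helper, and the example payloads are made up rather than real API responses.

```python
import json


def parse_download_link(raw, key="video_url_HQ"):
    """Return the download link from a raw response, or None if unavailable."""
    try:
        payload = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # body was not valid UTF-8 JSON
    if not isinstance(payload, dict):
        return None
    link = payload.get(key)
    # Only accept a non-empty string as a usable link
    return link if isinstance(link, str) and link else None


# Made-up payloads for illustration
print(parse_download_link(b'{"video_url_HQ": "https://example.com/v.mp4"}'))
print(parse_download_link(b'{"message": "rate limited"}'))
```

Inside the download loop, a `None` result could be handled with `continue` so that one bad video is skipped instead of raising a `KeyError`.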

## Combine Videos and Save

Keep only the files in the selected directory that have a common video file extension. This is done because, within a sample of videos downloaded from TikTok, the majority were MP4 but one was MOV.

In [None]:
# Selected folder containing sample videos
mypath = "./VidsForUpload/"
# Extensions are lowercase because the match below is case-insensitive
usual_types = ("mp4", "mov", "wmv", "avi", "avchd",
               "flv", "f4v", "swf", "mkv", "webm", "html5")

# Get the files, keep only those ending with one of the above, and sort them
# so the clips are combined in a predictable order
video_files = [f for f in listdir(mypath) if isfile(join(mypath, f))]
video_files = sorted(val for val in video_files if val.lower().endswith(usual_types))
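
As a sketch of an alternative, the same filter can be written with `pathlib`, matching on the file suffix rather than the end of the name. `list_video_files` is a hypothetical helper, and the suffix set is narrowed here to the two types seen in the sample. The demo uses a temporary folder with empty placeholder files.

```python
import tempfile
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov"}  # the two types seen in the downloaded sample


def list_video_files(folder, suffixes=VIDEO_SUFFIXES):
    """Return sorted names of files in folder whose suffix is a video type."""
    return sorted(
        p.name for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


# Demo on a temporary folder containing empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("vid_01.mp4", "vid_00.MOV", "notes.txt"):
        (Path(tmp) / name).touch()
    found = list_video_files(tmp)

print(found)  # ['vid_00.MOV', 'vid_01.mp4']
```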

The following appends all videos in the location together. This may change so that I can add videos directly from a URL without saving them to my hard drive.

In [None]:
# Load each video file into a clip and collect the clips in a list
clips = [VideoFileClip(mypath + f) for f in video_files]

# Combine all the videos together
combined_vids = concatenate_videoclips(clips)

In [None]:
# Write the video
combined_vids.write_videofile("video_for_upload.mp4")
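
The load, combine and write steps can also be wrapped in one function that releases the clips afterwards. This is a sketch only, assuming MoviePy 1.x (the version that provides the `moviepy.editor` module imported at the top of the notebook). `combine_clips` is a hypothetical helper name.

```python
def combine_clips(paths, output_path):
    """Concatenate the video files in paths and write them to output_path."""
    # Imported here so the function can be defined without MoviePy installed
    from moviepy.editor import VideoFileClip, concatenate_videoclips

    clips = []
    try:
        for path in paths:
            clips.append(VideoFileClip(path))
        # method="compose" handles clips of different sizes by placing each
        # one centred on a canvas as large as the biggest clip
        combined = concatenate_videoclips(clips, method="compose")
        combined.write_videofile(output_path)
        combined.close()
    finally:
        # Release the file readers even if writing fails
        for clip in clips:
            clip.close()
```

A call such as `combine_clips([mypath + f for f in video_files], "video_for_upload.mp4")` would mirror the cells above. The `compose` method is suggested because the dataframe carries `height` and `width` columns, which hints that the downloaded videos may not all share one resolution.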