# TikTok-YouTube Project


The overall aim of this project is to pull trending videos from TikTok and create a compilation video that is automatically published to YouTube. Once the pipeline runs autonomously, I aim to publish videos 4-6 times daily, which will hopefully generate some residual income.

## Load Modules

In [2]:
import http.client # Query API
import pandas as pd # Manipulate Dataframe
import json # Manipulate JSON objects
import urllib.request # Download videos from URL
from os import listdir # List files in a directory
from os.path import isfile, join # Joining and filtering files
from moviepy.editor import * # Write final video to drive

## Retrieve Video Data

This code queries a TikTok API hosted on RapidAPI and pulls video data for a user-selected hashtag. There is scope to generate the hashtag automatically, for example by running another API query, selecting the most popular hashtag, and then using it in the query that pulls the data.

In [6]:
# Connect to API
conn = http.client.HTTPSConnection("tiktok82.p.rapidapi.com")

# RapidAPI credentials - TODO: store these securely instead of in the notebook
headers = {
    'X-RapidAPI-Key': "ee9f9a0a74mshdf11304da3f9e15p15be10jsna76036e001f3",
    'X-RapidAPI-Host': "tiktok82.p.rapidapi.com"
    }

# Perform the request; the hashtag and cursor values in the query string can be changed
conn.request("GET", "/getChallengeVideos?hashtag=fyp&cursor=50", headers=headers)

# Get the response from API
res = conn.getresponse()

# Read the data
tiktok_hashtag_info = res.read()
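
One possible way to keep the API key out of the notebook and to make the hashtag and cursor easy to change is sketched below. The helper names `build_hashtag_path` and `build_headers`, and the environment variable name `RAPIDAPI_KEY`, are assumptions made for illustration; only the endpoint path and header names come from the cell above.

```python
import os
from urllib.parse import urlencode


def build_hashtag_path(hashtag, cursor=0):
    """Return the request path for the hashtag endpoint used above."""
    # urlencode escapes the values and keeps the parameters in dict order.
    query = urlencode({"hashtag": hashtag, "cursor": cursor})
    return "/getChallengeVideos?" + query


def build_headers(env_var="RAPIDAPI_KEY"):
    """Build the request headers, reading the key from an environment variable."""
    # RAPIDAPI_KEY is an assumed name; export the key under any name you like.
    api_key = os.environ.get(env_var)
    if api_key is None:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "X-RapidAPI-Key": api_key,
        "X-RapidAPI-Host": "tiktok82.p.rapidapi.com",
    }


print(build_hashtag_path("fyp", cursor=50))
```

The resulting path and headers could then be passed to `conn.request("GET", path, headers=headers)` in place of the hard-coded values.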

## Manipulate Data

Here we convert the data into a dataframe and manipulate it until we are ready to download the videos from TikTok.

In [26]:
# Parse the JSON response and select the list of videos under data -> item_list
df = json.loads(tiktok_hashtag_info.decode("utf-8"))['data']['item_list']

# Convert the list of dictionaries to a dataframe
df = pd.DataFrame.from_dict(df)

# Select columns required
df = df[['id', 'desc', 'createTime', 'video', 'author', 'stats', 'authorStats']]

In [27]:
# Unnest the video column and remove it
df = pd.concat([df.drop('video', axis=1),
                pd.DataFrame(df['video'].tolist())], axis=1)

# Unnest the stats column and remove it
df = pd.concat([df.drop('stats', axis=1),
                pd.DataFrame(df['stats'].tolist())], axis=1)

# Unnest the authorStats column and remove it
df = pd.concat([df.drop('authorStats', axis=1),
                pd.DataFrame(df['authorStats'].tolist())], axis=1)

# Unnest the author column and remove it. The author data has its own id column
# (the author id, not the video id), so after this step the dataframe contains
# two columns named id: the video id first, then the author id
df = pd.concat([df.drop('author', axis=1),
                pd.json_normalize(df['author'])], axis=1)
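
As an alternative to unnesting one column at a time, `pd.json_normalize` can flatten the whole list of records in one call. The sketch below uses made-up records that only mimic the nested shape described above; the real API response has more fields. Nested keys are prefixed with their parent name, so the video `id` and the author `author.id` no longer collide.

```python
import pandas as pd

# Made-up records that mimic the nested shape of the API items (illustrative only).
items = [
    {
        "id": "111",
        "desc": "first video",
        "video": {"duration": 12, "downloadAddr": "https://example.com/a"},
        "stats": {"playCount": 500, "shareCount": 3},
        "author": {"id": "a1", "uniqueId": "user_one"},
    },
    {
        "id": "222",
        "desc": "second video",
        "video": {"duration": 30, "downloadAddr": "https://example.com/b"},
        "stats": {"playCount": 900, "shareCount": 7},
        "author": {"id": "a2", "uniqueId": "user_two"},
    },
]

# Flatten every nested dictionary; column names become "parent.child".
flat = pd.json_normalize(items)

# Sort so the most played videos come first.
flat = flat.sort_values("stats.playCount", ascending=False)
print(flat[["id", "author.id", "author.uniqueId", "stats.playCount"]])
```

The trade-off is that the flattened column names are longer (`stats.playCount` rather than `playCount`), so later cells would need to use the prefixed names.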

In [29]:
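# Keep the video id (the first column; a second id column holds the author id)
# together with the columns needed later
df = pd.concat([df.iloc[:,0], df[['desc', 'createTime', 'height', 'width', 'duration', 'downloadAddr',
   'shareCount', 'commentCount', 'playCount', 'followerCount', 'heartCount',
   'uniqueId', 'nickname']]], axis = 1)
# Sort so the most played videos come first
df = df.sort_values(['playCount', 'shareCount', 'followerCount'], ascending = False)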

In [42]:
# Build the API request path for each video from the author's uniqueId and the video id
df['videoUrl'] = "/getDownloadVideoWithWatermark?video_url=https%3A%2F%2Fwww.tiktok.com%2F%40" + df['uniqueId'] + "%2Fvideo%2F" + df['id']
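
The percent-encoded string above can also be produced with `urllib.parse.quote`, which may be easier to read and also escapes any unusual characters in a username. This is a small sketch; the helper name `build_download_path` and the sample username and video id are made up for illustration.

```python
from urllib.parse import quote


def build_download_path(unique_id, video_id):
    """Return the request path for downloading one video with watermark."""
    page_url = f"https://www.tiktok.com/@{unique_id}/video/{video_id}"
    # safe="" makes quote escape ":", "/" and "@" as well.
    return "/getDownloadVideoWithWatermark?video_url=" + quote(page_url, safe="")


print(build_download_path("some_user", "123"))
```

With a dataframe, the helper could be applied row by row, for example `df.apply(lambda row: build_download_path(row['uniqueId'], row['id']), axis=1)`.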

In [57]:
# df.to_csv("storedData.csv")

The code below downloads each video, with watermark, by looping over the request paths built above.


In [43]:
vids_to_download = df['videoUrl']

7     /getDownloadVideoWithWatermark?video_url=https...
2     /getDownloadVideoWithWatermark?video_url=https...
4     /getDownloadVideoWithWatermark?video_url=https...
9     /getDownloadVideoWithWatermark?video_url=https...
3     /getDownloadVideoWithWatermark?video_url=https...
8     /getDownloadVideoWithWatermark?video_url=https...
6     /getDownloadVideoWithWatermark?video_url=https...
1     /getDownloadVideoWithWatermark?video_url=https...
11    /getDownloadVideoWithWatermark?video_url=https...
0     /getDownloadVideoWithWatermark?video_url=https...
5     /getDownloadVideoWithWatermark?video_url=https...
10    /getDownloadVideoWithWatermark?video_url=https...
Name: videoUrl, dtype: object

In [44]:
import http.client
import urllib.request

conn = http.client.HTTPSConnection("tiktok82.p.rapidapi.com")

headers = {
    'X-RapidAPI-Key': "ee9f9a0a74mshdf11304da3f9e15p15be10jsna76036e001f3",
    'X-RapidAPI-Host': "tiktok82.p.rapidapi.com"
    }

loc = "./VidsForUpload/"
i = 0

# Loop through the request paths and download each video to the target folder
while i < len(vids_to_download):
    filename = "vid_" + f'{i:02d}' + ".mp4"
    # Use positional indexing: the series keeps its original labels after sorting
    vid_url = vids_to_download.iloc[i]
    save_location = loc + filename
    conn.request("GET",
                 vid_url,
                 headers=headers)

    # The API response contains a direct link to the high-quality video file
    res = conn.getresponse()
    data = res.read()
    HQ_url = json.loads(data.decode("utf-8"))['video_url_HQ']
    urllib.request.urlretrieve(HQ_url, save_location)
    i = i + 1
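
Because the dataframe was sorted by play count, the series of request paths keeps its original row labels (7, 2, 4, ... in the output above). The sketch below, using made-up paths, shows the difference between label-based and position-based lookup, and one way to pair each path with a zero-padded filename in sorted order.

```python
import pandas as pd

# Made-up request paths; the index mimics a dataframe that was sorted by play count.
vids = pd.Series(["/path_c", "/path_a", "/path_b"], index=[2, 0, 1])

# Label-based lookup returns the row labelled 0, not the first row.
by_label = vids.loc[0]

# Position-based lookup returns the first row in sorted order.
by_position = vids.iloc[0]

# Pair each path with a zero-padded filename, following the sorted order.
download_plan = [(f"vid_{n:02d}.mp4", path) for n, path in enumerate(vids)]
print(by_label, by_position)
print(download_plan)
```

Iterating with `enumerate` avoids the manual counter and guarantees that `vid_00.mp4` is the most played video.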

Keep only the files in the selected directory that have a common video file extension. This is done because, within a sample of videos downloaded from TikTok, the majority were MP4 but one was MOV.

In [52]:
# Load packages for file observation
from os import listdir
from os.path import isfile, join

# Selected folder containing sample videos
mypath = "./VidsForUpload/"
# Common video types, written in lower case so the check below can ignore case
usual_types = (".mp4", ".mov", ".wmv", ".avi", ".avchd",
               ".flv", ".f4v", ".swf", ".mkv", ".webm", ".html5")

# Get the files, then keep only those whose extension (ignoring case) is listed above.
# Sorting keeps the order stable, since listdir returns files in arbitrary order.
video_files = [f for f in listdir(mypath) if isfile(join(mypath, f))]
video_files = sorted(val for val in video_files if val.lower().endswith(usual_types))
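
The same filter can be written with `pathlib`, which exposes the file extension directly. This is a minimal sketch; the helper name `list_video_files` and the shortened set of extensions are illustrative choices, not part of the original notebook.

```python
from pathlib import Path

# Illustrative subset of extensions, lower case with the leading dot.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def list_video_files(folder):
    """Return the sorted names of video files in a folder, ignoring case."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_SUFFIXES
    )
```

Calling `list_video_files("./VidsForUpload/")` would return names such as `vid_00.mp4` in alphabetical order, which matches the download order when the files use zero-padded numbers.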

The following appends all videos in the location together. This may change so that I can add videos directly from a URL without saving them to my hard drive.

In [54]:
# Import moviepy to combine videos
from moviepy.editor import *

# Load every video file into a list of clips
clips = [VideoFileClip(mypath + f) for f in video_files]

# Combine all the videos together
combined_vids = concatenate_videoclips(clips)

In [56]:
# Write the video
combined_vids.write_videofile("video_for_upload.mp4")

Moviepy - Building video video_for_upload.mp4.
MoviePy - Writing audio in video_for_uploadTEMP_MPY_wvf_snd.mp3
MoviePy - Done.
Moviepy - Writing video video_for_upload.mp4

Moviepy - Done !
Moviepy - video ready video_for_upload.mp4
