<a href="https://colab.research.google.com/github/frasercrichton/data-investigation-conspiracy-aotearoa/blob/main/analysis/Pattern_of_Life.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Bad Leader

Italy's far-right prime minister described the recent fires in Sicily as 'bad weather'. So what did this bad weather look like?


In [29]:
import pandas as pd
from deep_translator import GoogleTranslator


## Parse the Videos

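- Extract:
  - Engagement counts (collects, comments, likes, plays, shares)
  - Hashtags
  - Content warning text
  - Cover image URL
- Translate the descriptions and hashtags from Italian to English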
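
Translating every description is the slow part of this step, and some values may be blank or repeated. The sketch below is one way to guard for that: it wraps any translation function so that missing or blank values are passed through untouched and repeated strings are only translated once. `make_cached_translator` and `fake_translate` are hypothetical names introduced for illustration; the stand-in translator just upper-cases text so the example runs offline.

```python
from typing import Callable, Dict, Optional


def make_cached_translator(
    translate_fn: Callable[[str], str]
) -> Callable[[Optional[str]], Optional[str]]:
    """Wrap a translation function so blanks are skipped and repeats are reused."""
    cache: Dict[str, str] = {}

    def translate(text: Optional[str]) -> Optional[str]:
        # Pass missing or blank values through without calling the service
        if not isinstance(text, str) or not text.strip():
            return text
        # Only call the underlying translator the first time a string is seen
        if text not in cache:
            cache[text] = translate_fn(text)
        return cache[text]

    return translate


# Stand-in translator (hypothetical) that records how often it is called
calls = []


def fake_translate(text: str) -> str:
    calls.append(text)
    return text.upper()


translate = make_cached_translator(fake_translate)
results = [translate(t) for t in ['Grazie ragazzi!', '', None, 'Grazie ragazzi!']]
print(results)     # blank and missing values come back unchanged
print(len(calls))  # the repeated string reached the translator once
```

In this notebook, `GoogleTranslator(source='it', target='en').translate` could be passed in place of `fake_translate`.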

In [158]:
videos_df = pd.read_json('../data/source/video.json', convert_dates=['createTime'])
# Drop the columns that are not needed for this analysis
videos_df = videos_df.drop(columns=[
    'author',
    'challenges',
    'collected',
    'contents',
    'digged',
    'duetDisplay',
    'forFriend',
    'itemCommentStatus',
    'privateItem',
    'secret',
    'shareEnabled',
    'stitchDisplay',
    'officalItem',  # spelled this way in the source data
    'originalItem',
    'duetEnabled',
    'stitchEnabled',
])

def translate_text(text):
    # Translate Italian text into English; the print shows progress while the cell runs
    translation = GoogleTranslator(source='it', target='en').translate(text)
    print(translation)
    return translation

def parse_desc(desc):
    return translate_text(desc)

def parse_stats(stats):
    return pd.Series([stats['collectCount'], stats['commentCount'], stats['diggCount'], stats['playCount'], stats['shareCount']])

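def parse_text_extra(extra_text):
    # No textExtra for this video: leave both output columns empty
    if extra_text is None:
        return pd.Series([extra_text, extra_text])

    # Join the hashtag names into one comma-separated string, then translate it
    extra_text_as_json = pd.json_normalize(extra_text)
    hashtags = ', '.join(extra_text_as_json['hashtagName'].values)
    return pd.Series([hashtags, translate_text(hashtags)])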

def parse_warn_info(warn_info):
    if warn_info is None:
        return warn_info
    
    return warn_info[0]['text']

def parse_video(video):    
    return video['zoomCover']['960']

# desc: translate the video description into English
videos_df['desc_en'] = videos_df.apply(lambda row: parse_desc(row['desc']), axis=1)

# textExtra: extract a list of hashtags
videos_df[['textExtra', 'textExtra_en']] = videos_df.apply(lambda row: parse_text_extra(row['textExtra']), axis=1)

# stats: turn the like, comment, etc. counts into columns
videos_df[['collectCount', 'commentCount', 'diggCount', 'playCount', 'shareCount']] = videos_df.apply(lambda row: parse_stats(row['stats']), axis=1)
videos_df = videos_df.drop(columns=['stats'])

# warnInfo: extract any content warning text 
videos_df['warnInfo'] = videos_df.apply(lambda x: parse_warn_info(x['warnInfo']), axis=1)

# video: get the URL of the '960' zoomCover image
videos_df['coverImage'] = videos_df.apply(lambda row: parse_video(row['video']), axis=1)

videos_df


My interview this evening on "Cinque Minuti", on Rai 1.
We are giving the Nation a strategy that it hasn't had for years, a pride that it had forgotten and a stability that is the basis of any real change possible. This is just the beginning.
Thank you India, congratulations on the success of the #G20.
41 years after the brutal mafia attack which caused the death of the Carabinieri General Carlo Alberto Dalla Chiesa, his wife Emanuela Setti Carraro and the escort agent Domenico Russo, the commitment to eradicate all forms of organized crime. Our deepest thanks and respect go to General Dalla Chiesa, an example of integrity and courage, and to all the servants of the State who fell fighting to free Italy from the cancer of the mafia. Your fight is ours and we will never back down.
Minimum wage: press point after the meeting with the opposition.
Minimum wage: I will explain the critical issues and what we intend to do, also involving the opposition, to present a serious, shared proposal 

Unnamed: 0,createTime,desc,id,music,video,textExtra,warnInfo,desc_en,textExtra_en,collectCount,commentCount,diggCount,playCount,shareCount,coverImage
0,2023-09-13 18:57:03,"La mia intervista di questa sera a ""Cinque Min...",7278386520275373056,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 1178262, 'bitrateInfo': [{'Bitrate...",,,"My interview this evening on ""Cinque Minuti"", ...",,634,2065,11600,325800,351,https://p16-sign-useast2a.tiktokcdn.com/tos-us...
1,2023-09-12 16:05:41,Stiamo dando alla Nazione una strategia che no...,7277971263958666240,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 1021716, 'bitrateInfo': [{'Bitrate...",,,We are giving the Nation a strategy that it ha...,,938,5926,19100,537500,553,https://p16-sign-useast2a.tiktokcdn.com/tos-us...
2,2023-09-11 09:02:24,"Grazie India, complimenti per il successo del ...",7277491085071387648,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 1374581, 'bitrateInfo': [{'Bitrate...",g20,,"Thank you India, congratulations on the succes...",g20,300,449,6849,131700,209,https://p16-sign-useast2a.tiktokcdn.com/tos-us...
3,2023-09-03 07:28:10,A 41 anni dal brutale attentato mafioso che ha...,7274498125660703744,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 627490, 'bitrateInfo': [{'Bitrate'...",,,41 years after the brutal mafia attack which c...,,275,274,4846,140300,204,https://p16-sign-useast2a.tiktokcdn.com/tos-us...
4,2023-08-12 06:05:11,Salario minimo: punto stampa dopo l’incontro c...,7266312532665584640,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 1077066, 'bitrateInfo': [{'Bitrate...",,,Minimum wage: press point after the meeting wi...,,2842,7137,52500,1800000,1386,https://p16-sign-useast2a.tiktokcdn.com/tos-us...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
199,2022-05-01 12:52:33,Grazie ragazzi!,7092750010739215360,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 1200774, 'bitrateInfo': [{'Bitrate...",,,Thank you guys!,,479,157,4396,114900,459,https://p16-sign-va.tiktokcdn.com/tos-maliva-p...
200,2022-05-01 11:43:30,I patrioti italiani tracciano la rotta del pro...,7092732219587873792,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 906909, 'bitrateInfo': [{'Bitrate'...",,,Italian patriots chart the course of the conse...,,38,48,1477,38100,35,https://p16-sign-va.tiktokcdn.com/tos-maliva-p...
201,2022-04-30 14:00:56,L'Europa si è presentata all'appuntamento dell...,7092396552064453632,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 360640, 'bitrateInfo': [{'Bitrate'...",,,Europe presented itself at the rendezvous of h...,,39,30,1341,34300,32,https://p16-sign-va.tiktokcdn.com/tos-maliva-p...
202,2022-04-30 07:15:20,La libertà? Negli ultimi due anni è stata sacr...,7092292016104606720,"{'authorName': 'Giorgia Meloni', 'coverLarge':...","{'bitrate': 828646, 'bitrateInfo': [{'Bitrate'...",energiadaliberare,,Freedom? In the last two years it has been sac...,energytoliberate,47,32,1142,31100,48,https://p16-sign-va.tiktokcdn.com/tos-maliva-p...
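
Every `id` shown in the table above is an exact multiple of 1024, which is the spacing between neighbouring 64-bit floats at this magnitude, so the ids may have been rounded while loading. The sketch below illustrates the effect and one possible guard. It uses a made-up 19-digit id, and it assumes the ids are stored as strings in `video.json`; neither is confirmed by the data shown here.

```python
import io

import pandas as pd

# Made-up 19-digit id for illustration; not taken from the dataset
original = 7278386520275373831
raw = '[{"id": "7278386520275373831", "desc": "Grazie ragazzi!"}]'

# A 64-bit float cannot represent every integer of this size,
# so a round trip through float lands on a nearby multiple of 1024
rounded = int(float(original))
print(rounded == original)
print(rounded % 1024)

# Asking pandas to keep the column as text avoids any numeric conversion
df = pd.read_json(io.StringIO(raw), dtype={'id': str})
print(df['id'].iloc[0])
```

If the ids are only used as labels, keeping them as strings is usually enough; they matter most when building links back to individual videos.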


Write out the results to JSON and CSV files.

In [148]:
videos_df.to_json('../data/processed/videos-translated.json')
videos_df.to_csv('../data/processed/videos-translated.csv', sep=',')
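
When the CSV is read back later, the index written by `to_csv` returns as an extra `Unnamed: 0` column. A minimal sketch of the difference, using a one-row frame borrowed from the table above and a temporary directory rather than the notebook's real output paths (both are stand-ins for illustration):

```python
import tempfile
from pathlib import Path

import pandas as pd

# One-row stand-in for videos_df
df = pd.DataFrame({
    'desc': ['Grazie ragazzi!'],
    'desc_en': ['Thank you guys!'],
    'diggCount': [4396],
})

with tempfile.TemporaryDirectory() as tmp:
    with_index = Path(tmp) / 'with-index.csv'
    without_index = Path(tmp) / 'without-index.csv'

    df.to_csv(with_index)                  # default: the index becomes an unnamed first column
    df.to_csv(without_index, index=False)  # the index is left out of the file

    cols_with = list(pd.read_csv(with_index).columns)
    cols_without = list(pd.read_csv(without_index).columns)

print(cols_with)
print(cols_without)
```

Reading the existing file with `pd.read_csv(path, index_col=0)` is an alternative that keeps the index without creating the extra column.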