# Project description

**Status:** Done. Need to add more comments & description. 

## READ before running the notebook
If you are using Jupyter Notebook and have the variable inspector ticked in your nbextensions config (check the Edit drop-down menu, last option), I highly recommend unticking it before running this notebook. Some variables in this notebook are `pd.DataFrame`s with tens of millions of rows, and the inspector slows the whole notebook down.


## Description:
The purpose of this notebook is to download [IMDB's public datasets](https://www.imdb.com/interfaces/), which consist of 7 `.tsv` files. The table below describes the original files, not the cleaned ones; it still needs to be updated to reflect the changes made in [4 Cleaning each files](#Cleaning-each-files):


File name / feature name | Feature description
--- | --- 
**akas**  |  Contains the following information for titles:
titleId (string)  |  a tconst, an alphanumeric unique identifier of the title
ordering (integer)  |  a number to uniquely identify rows for a given titleId
title (string)  |  the localized title
region (string)  |  the region for this version of the title
language (string)  |  the language of the title
types (array)  |  Enumerated set of attributes for this alternative title. One or more of the following: "alternative", "dvd", "festival", "tv", "video", "working", "original", "imdbDisplay". New values may be added in the future without warning
attributes (array)  |  Additional terms to describe this alternative title, not enumerated
isOriginalTitle (boolean)  |  0: not original title; 1: original title
|
|
**basics**  |  Contains the following information for titles:
tconst (string)  |  alphanumeric unique identifier of the title
titleType (string)  |  the type/format of the title (e.g. movie, short, tvseries, tvepisode, video, etc)
primaryTitle (string)  |  the more popular title / the title used by the filmmakers on promotional materials at the point of release
originalTitle (string)  |  original title, in the original language
isAdult (boolean)  |  0: non-adult title; 1: adult title
startYear (YYYY)  |  represents the release year of a title. In the case of TV Series, it is the series start year
endYear (YYYY)  |  TV Series end year. `\N` for all other title types
runtimeMinutes  |  primary runtime of the title, in minutes
genres (string array)  |  includes up to three genres associated with the title
|
|
**crew**  |  Contains the director and writer information for all the titles in IMDb. Fields include:
tconst (string)  |  alphanumeric unique identifier of the title
directors (array of nconsts)  |  director(s) of the given title
writers (array of nconsts)  |  writer(s) of the given title
|
|
**episode**  |  Contains the tv episode information. Fields include:
tconst (string)  |  alphanumeric identifier of episode
parentTconst (string)  |  alphanumeric identifier of the parent TV Series
seasonNumber (integer)  |  season number the episode belongs to
episodeNumber (integer)  |  episode number of the tconst in the TV series
|
|
**principals**  |  Contains the principal cast/crew for titles
tconst (string)  |  alphanumeric unique identifier of the title
ordering (integer)  |  a number to uniquely identify rows for a given titleId
nconst (string)  |  alphanumeric unique identifier of the name/person
category (string)  |  the category of job that person was in
job (string)  |  the specific job title if applicable, else `\N`
characters (string)  |  the name of the character played if applicable, else `\N`
|
|
**ratings**  |  Contains the IMDb rating and votes information for titles
tconst (string)  |  alphanumeric unique identifier of the title
averageRating  |  weighted average of all the individual user ratings
numVotes  |  number of votes the title has received
|
|
**name**  |  Contains the following information for names:
nconst (string)  |  alphanumeric unique identifier of the name/person
primaryName (string)  |  name by which the person is most often credited
birthYear  |  in YYYY format
deathYear  |  in YYYY format if applicable, else `\N`
primaryProfession (array of strings)  |  the top-3 professions of the person
knownForTitles (array of tconsts)  |  titles the person is known for






**Run time on Macbook Pro 2017, i5 3.1 GHz 2 cores, 8GB of ram:** ~40mn

**Macbook Air M1 🥺BG of ram:** 🥺

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />
<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

# Get Imdb files links


## Import libraries 

* Requests to get the page 
* Beautiful Soup to get the content of that page
* Os and Shutil for file management
* Patoolib to decompress .gz archives
* Time to know how long it takes to execute the script
* Termcolor for unnecessary beautiful colored print statements 
* Re for string manipulation
* Caffeine for display always on

In [1]:
from bs4 import BeautifulSoup as bs
import requests

import os
import shutil
import patoolib as patoo
import re

import pandas as pd
import numpy as np

import time

from termcolor import colored

import caffeine
def on(): # Shortcut def to keep the display on while running long code.
    caffeine.on(display=False)

## Scrape files links

You know when you click on a link and it doesn't open a new page but download a file? That is what's happening my friend, we're getting those links!

In [2]:
start = time.time()

# Link where to find the datasets 
url = "https://datasets.imdbws.com/"
links = []

# Get the web page
page = requests.get(url)

# Get the page's html 
soup = bs(page.content, "html.parser")

# Collect the href attribute of every <a> tag
for href in soup.find_all("a"):
    links.append(href["href"])
    
links

['http://www.imdb.com/interfaces/',
 'https://datasets.imdbws.com/name.basics.tsv.gz',
 'https://datasets.imdbws.com/title.akas.tsv.gz',
 'https://datasets.imdbws.com/title.basics.tsv.gz',
 'https://datasets.imdbws.com/title.crew.tsv.gz',
 'https://datasets.imdbws.com/title.episode.tsv.gz',
 'https://datasets.imdbws.com/title.principals.tsv.gz',
 'https://datasets.imdbws.com/title.ratings.tsv.gz']
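
If Beautiful Soup isn't available, the same link harvesting can be sketched with the standard library alone. This is only an illustration: `LinkCollector` and the inline `sample_html` are made-up names and data standing in for the real page, and the filter assumes the dataset links always end in `.tsv.gz`.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up stand-in for the HTML of the datasets page
sample_html = """
<ul>
  <li><a href="http://www.imdb.com/interfaces/">Documentation</a></li>
  <li><a href="https://datasets.imdbws.com/name.basics.tsv.gz">name.basics.tsv.gz</a></li>
  <li><a href="https://datasets.imdbws.com/title.ratings.tsv.gz">title.ratings.tsv.gz</a></li>
</ul>
"""

collector = LinkCollector()
collector.feed(sample_html)

# Keep only the archive links (assumption: they end in .tsv.gz)
archive_links = [link for link in collector.links if link.endswith(".tsv.gz")]
print(archive_links)
```

Filtering on the `.tsv.gz` suffix is a bit stricter than checking whether `"tsv"` appears anywhere in the link.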

## Store them

In [3]:
down_links = [x for x in links if "tsv" in x]
metadata_link = list(set(links) - (set(down_links)))[0]

metadata_link

'http://www.imdb.com/interfaces/'

In [4]:
down_links

['https://datasets.imdbws.com/name.basics.tsv.gz',
 'https://datasets.imdbws.com/title.akas.tsv.gz',
 'https://datasets.imdbws.com/title.basics.tsv.gz',
 'https://datasets.imdbws.com/title.crew.tsv.gz',
 'https://datasets.imdbws.com/title.episode.tsv.gz',
 'https://datasets.imdbws.com/title.principals.tsv.gz',
 'https://datasets.imdbws.com/title.ratings.tsv.gz']

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />
<hr style="height:.9px;border:none;color:#333;background-color:#333;" />


# Download files

## Create new directory to store them

In [5]:
directory = "imdb_raw_data"

if not os.path.exists(directory):
    os.makedirs(directory)

## Download the files

In [6]:

print('Download Starting...')
start = time.time()

for url in down_links:
    start2 = time.time()

    # Download files
    r = requests.get(url)
    filename = url.split('/')[-1].replace(".", "_", 1)
    print(filename)

    with open(filename,'wb') as output_file:
        output_file.write(r.content)


    # Decompress file
    patoo.extract_archive(filename, outdir="")

    # delete compressed file
    os.remove(filename)

    # Move file to the right folder
    filename = filename.replace(".gz", "")
    shutil.move(filename, directory)


    index = down_links.index(url) + 1
    print(colored(f"{filename} done, {len(down_links) - index} more - {round(time.time() - start2)}s", "red"))

print(colored(f"Download Completed!!! It took {round(time.time() - start)} seconds", "red", attrs=['bold']))

Download Starting...
name_basics.tsv.gz
patool: Extracting name_basics.tsv.gz ...
patool: running '/usr/bin/gzip' -c -d -- 'name_basics.tsv.gz' > 'name_basics.tsv'
patool:     with shell='True'
patool: ... name_basics.tsv.gz extracted to `'.
name_basics.tsv done, 6 more - 11s
title_akas.tsv.gz
patool: Extracting title_akas.tsv.gz ...
patool: running '/usr/bin/gzip' -c -d -- 'title_akas.tsv.gz' > 'title_akas.tsv'
patool:     with shell='True'
patool: ... title_akas.tsv.gz extracted to `'.
title_akas.tsv done, 5 more - 13s
title_basics.tsv.gz
patool: Extracting title_basics.tsv.gz ...
patool: running '/usr/bin/gzip' -c -d -- 'title_basics.tsv.gz' > 'title_basics.tsv'
patool:     with shell='True'
patool: ... title_basics.tsv.gz extracted to `'.
title_basics.tsv done, 4 more - 7s
title_crew.tsv.gz
patool: Extracting title_crew.tsv.gz ...
patool: running '/usr/bin/gzip' -c -d -- 'title_crew.tsv.gz' > 'title_crew.tsv'
patool:     with shell='True'
patool: ... titl
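
As an aside, `.gz` archives can also be decompressed without patool or an external `gzip` binary. Here is a minimal sketch using only the standard library; `gunzip_file` is a hypothetical helper, and the demo writes a tiny throwaway archive rather than touching the real IMDb files.

```python
import gzip
import os
import shutil
import tempfile


def gunzip_file(src_path, dst_path):
    """Stream-decompress src_path (.gz) into dst_path, chunk by chunk."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


# Demo on a throwaway archive (made-up content)
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, "demo.tsv.gz")
tsv_path = os.path.join(tmp_dir, "demo.tsv")

with gzip.open(gz_path, "wb") as f:
    f.write(b"tconst\taverageRating\ntt0000001\t5.7\n")

gunzip_file(gz_path, tsv_path)

with open(tsv_path, "rb") as f:
    content = f.read()
print(content)

shutil.rmtree(tmp_dir)  # clean up the demo folder
```

Because `shutil.copyfileobj` copies in chunks, the decompressed file never has to fit in memory at once, which matters for the larger files.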

## Checking files

Here I just check that the seven files whose links were scraped above are in the directory, in the right format.

In [7]:
from os import listdir
from os.path import isfile, join

onlyfiles = [f for f in listdir(directory) if isfile(join(directory, f)) and ".tsv" in f]

for file in onlyfiles:
    print(file)

title_basics.tsv
title_ratings.tsv
title_crew.tsv
name_basics.tsv
title_akas.tsv
title_episode.tsv
title_principals.tsv
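
The check above is done by eye. A small sketch of an automated version follows, assuming the seven file names shown in the output; `missing_files` is a hypothetical helper, and the demo runs against a temporary folder instead of `imdb_raw_data`.

```python
import os
import tempfile

# File names expected after download and extraction (taken from the listing above)
expected = {
    "name_basics.tsv",
    "title_akas.tsv",
    "title_basics.tsv",
    "title_crew.tsv",
    "title_episode.tsv",
    "title_principals.tsv",
    "title_ratings.tsv",
}


def missing_files(folder, expected_names):
    """Return the expected file names that are not present in folder, sorted."""
    present = {
        f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f))
    }
    return sorted(set(expected_names) - present)


# Demo: a folder that only contains two of the seven files
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("name_basics.tsv", "title_akas.tsv"):
        open(os.path.join(demo_dir, name), "w").close()
    missing = missing_files(demo_dir, expected)

print(missing)
```

An empty list means every expected file is in place.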


<hr style="height:.9px;border:none;color:#333;background-color:#333;" />
<hr style="height:.9px;border:none;color:#333;background-color:#333;" />



# Cleaning each files

## Basics

### Import file

Sometimes I get an I/O error for no apparent reason. Just rerun the cell and it should work.

In [8]:
df_basics = pd.read_csv(directory + "/title_basics.tsv", delimiter="\t", low_memory=False)


In [9]:
df_basics.columns = ["t_id", "type", "primary_title", "original_title", "for_adult", 
                           "start_year", "end_year", "runtime_mn", "genres"]

### Replacing missing values & title_id

* Missing values are represented by `\N`; I prefer to have them as `NaN` (`np.nan`) instead
* Changing title_id from `tt0000001` to `1`, `tt0000389` to `389`, and so on.
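
Both steps can also be done with vectorized pandas calls. This is a sketch on made-up sample data, not the notebook's own method: `na_values` tells `read_csv` to treat `\N` as missing at load time, and `str.replace` with a regex strips the non-digits from the identifier.

```python
import io

import pandas as pd

# Tiny made-up sample in the same shape as the IMDb files
sample = "tconst\tstartYear\ntt0000001\t1894\ntt0000389\t\\N\n"

# Treat \N as a missing value while reading
df = pd.read_csv(io.StringIO(sample), sep="\t", na_values="\\N")

# Keep only the digits of the identifier and cast to int
df["t_id"] = df["tconst"].str.replace(r"\D", "", regex=True).astype(int)

print(df)
```

On tens of millions of rows, a vectorized string operation is usually much faster than `apply` with a Python lambda.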

In [10]:
def clean_df(df):
    for col in df.columns: 
        df[col] = df[col].replace(r"\N", np.nan)
        
        if "id" in col: 
            df[col] = df[col].apply(lambda x: int(re.sub("[^0-9]", "", x))) # Keeping only numbers from str
        
    return df


df_basics = clean_df(df_basics)
df_basics.head()


Unnamed: 0,t_id,type,primary_title,original_title,for_adult,start_year,end_year,runtime_mn,genres
0,1,short,Carmencita,Carmencita,0,1894,,1,"Documentary,Short"
1,2,short,Le clown et ses chiens,Le clown et ses chiens,0,1892,,5,"Animation,Short"
2,3,short,Pauvre Pierrot,Pauvre Pierrot,0,1892,,4,"Animation,Comedy,Romance"
3,4,short,Un bon bock,Un bon bock,0,1892,,12,"Animation,Short"
4,5,short,Blacksmith Scene,Blacksmith Scene,0,1893,,1,"Comedy,Short"


### Adding runtime to genres 

Some observations have the genres value in the runtimeMinutes column (and missing values in most of the other columns).

Here is an example:

In [11]:
df_basics[df_basics["runtime_mn"] == 'Reality-TV']


Unnamed: 0,t_id,type,primary_title,original_title,for_adult,start_year,end_year,runtime_mn,genres
1101591,10233364,tvEpisode,Rolling in the Deep Dish\tRolling in the Deep ...,0,2019,,,Reality-TV,
2345856,12415330,tvEpisode,Anthony Davis High Brow Tank\tAnthony Davis Hi...,0,2017,,,Reality-TV,
5216475,3984412,tvEpisode,"I'm Not Going to Come Last, I'm Just Going to ...",0,2014,,,Reality-TV,


Even though I will drop those observations later, I still want the data to be as clean as I can.

In [12]:
for genre in ['Reality-TV', 'Documentary', 'Talk-Show', 'Game-Show', 'Animation,Comedy,Family']:
    indexes = df_basics[df_basics["runtime_mn"] == genre].index
    
    for index in indexes: 
        df_basics.loc[index, "genres"] = genre
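
A likely cause of these shifted rows (an assumption, not something verified in this notebook) is a stray double quote in a title: with default quoting, the parser treats the quote as the start of a quoted field and swallows the next tab, which matches the `\t` visible inside `primary_title` above. The sketch below uses made-up rows and passes `quoting=csv.QUOTE_NONE` so that quotes are read as ordinary characters.

```python
import csv
import io

import pandas as pd

# Made-up sample: the first data row has unbalanced double quotes in its titles
sample = (
    "tconst\tprimaryTitle\toriginalTitle\tgenres\n"
    'tt1\t"Rolling in the Deep Dish\t"Rolling in the Deep Dish\tReality-TV\n'
    "tt2\tCarmencita\tCarmencita\tDocumentary\n"
)

# Default quoting: the quotes can glue two columns together and shift the rest
default = pd.read_csv(io.StringIO(sample), sep="\t")

# QUOTE_NONE: quotes are plain characters, so every tab is a separator
no_quote = pd.read_csv(io.StringIO(sample), sep="\t", quoting=csv.QUOTE_NONE)

print(default)
print(no_quote)
```

If this is indeed the cause, reading the file this way would make the manual genre repair unnecessary.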

### Changing data types
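
One general way to convert a column that mixes numbers with stray strings is `pd.to_numeric` with `errors="coerce"`, which turns anything unparsable into `NaN`. A small sketch on made-up values, offered as an alternative to listing every stray string by hand:

```python
import pandas as pd

# Made-up runtime column: numeric strings, a stray genre and a missing value
raw_runtime = pd.Series(["1", "5", "Reality-TV", None])

# Anything that cannot be parsed as a number becomes NaN
runtime = pd.to_numeric(raw_runtime, errors="coerce")

print(runtime)
```

The trade-off is that coercion silently hides unexpected values, so it is worth looking at what gets coerced before relying on it.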

In [13]:
df_basics["for_adult"] = df_basics["for_adult"].astype(float)
df_basics["start_year"] = df_basics["start_year"].astype(float)
df_basics["end_year"] = df_basics["end_year"].astype(float)

# Stray genre strings in runtime_mn become NaN so the column can be cast to float
df_basics["runtime_mn"] = df_basics["runtime_mn"].replace({"Reality-TV": np.nan,
                                                     "Documentary": np.nan,
                                                     "Talk-Show": np.nan,
                                                     "Game-Show": np.nan,
                                                     "Animation,Comedy,Family": np.nan}).astype(float)


### Keeping only 2000-2021

This can be used for later, to keep only movies from certain years. 

In [14]:
print(df_basics.shape)
#df_basics = df_basics[df_basics["start_year"].isin(list(range(2000, 2022)))].reset_index(drop=True)
print(df_basics.shape)


(7939695, 9)
(7939695, 9)
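
For reference, the year filter can be sketched as follows on a made-up frame, using the renamed `start_year` column. `Series.between` includes both bounds by default and treats missing years as not matching.

```python
import pandas as pd

# Made-up sample with the renamed columns used in this notebook
demo_basics = pd.DataFrame(
    {"t_id": [1, 2, 3, 4], "start_year": [1894.0, 2000.0, 2021.0, None]}
)

# Keep titles released between 2000 and 2021, both bounds included
recent = demo_basics[demo_basics["start_year"].between(2000, 2021)].reset_index(drop=True)

print(recent)
```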


### Getting all genres 

#### Get all genres with their title id 

In [15]:
genre_title = []

for x in range(len(df_basics["genres"])):
    genres = df_basics.loc[x, "genres"]
    title_id = df_basics.loc[x, "t_id"]
    
    if type(genres) == str:
        for genre in genres.split(","): 
            genre_title.append([title_id, genre])
            
    if x % 100000 == 0: # As the cells run for a long time, it helps keeping track
        length = len(df_basics["genres"])
        print(f"{x:,d} / {length:,d}", end="\r") 
        

len(genre_title)

7,900,000 / 7,939,695

11991888
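
The row-by-row loop above works but takes a while on almost 8 million rows. A vectorized sketch of the same idea is shown below on made-up data, assuming a pandas version that provides `DataFrame.explode` (0.25 or later): split the string into a list, then turn each list element into its own row.

```python
import pandas as pd

# Made-up sample: one comma-separated genres string per title
demo_basics = pd.DataFrame(
    {"t_id": [1, 2, 3], "genres": ["Documentary,Short", "Animation", None]}
)

demo_genres = (
    demo_basics[["t_id", "genres"]]
    .dropna(subset=["genres"])                              # skip titles without genres
    .assign(genre=lambda d: d["genres"].str.split(","))     # string -> list of genres
    .explode("genre")                                       # one row per (title, genre)
    .drop(columns="genres")
    .reset_index(drop=True)
)

print(demo_genres)
```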

#### Creating genre dataframe

In [16]:
df_genres = pd.DataFrame(genre_title)
df_genres.head()

Unnamed: 0,0,1
0,1,Documentary
1,1,Short
2,2,Animation
3,2,Short
4,3,Animation


### Export both tables

Keeping the genres column as a comma-separated list doesn't lead to anything, so I'm going to export two tables: `basics` without the genres column, and `genres`, which has two columns:
* genre, which contains each individual genre
* title_id, which contains the title's id (one observation for each genre of each title)


In [17]:
df_genres.to_csv(directory + "/genres.csv", index=False)

# We don't need genre column in basics as we have a new df for it 
df_basics.drop("genres", axis=1).to_csv(directory + "/basics.csv", index=False) # Exported as csv for convenience 

# We also don't need that tsv file anymore
os.remove(directory + "/title_basics.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Principals
### Import file

In [18]:
df_principals = pd.read_csv(directory + "/title_principals.tsv", delimiter="\t", low_memory=False)

df_principals.shape

(44923116, 6)

In [19]:
df_principals.columns = ['t_id', 'ordering', 'n_id', 'job_category', 'job_title', 'character_played']

### Replace missing values & id of id columns

In [20]:
df_principals = clean_df(df_principals)
df_principals.head()

Unnamed: 0,t_id,ordering,n_id,job_category,job_title,character_played
0,1,1,1588970,self,,"[""Self""]"
1,1,2,5690,director,,
2,1,3,374658,cinematographer,director of photography,
3,2,1,721526,director,,
4,2,2,1335271,composer,,


### Export title_principals 

Now that we have correct missing values and ids, let's export it!

In [21]:
df_principals.to_csv(directory + "/principals.csv", index=False)
os.remove(directory + "/title_principals.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Ratings

Finally a light file... The lightest actually!

### Import file

In [22]:
df_ratings = pd.read_csv(directory + "/title_ratings.tsv", delimiter="\t", low_memory=False)

df_ratings.shape

(1156035, 3)

In [23]:
df_ratings.columns = ['t_id', 'rating', 'votes']

### Replacing missing values & id of id_columns


In [24]:
df_ratings = clean_df(df_ratings)
df_ratings.head()


Unnamed: 0,t_id,rating,votes
0,1,5.7,1702
1,2,6.1,210
2,3,6.5,1462
3,4,6.2,123
4,5,6.2,2263


### Export Ratings 

Now that we have correct missing values and ids, let's export it!

In [25]:
df_ratings.to_csv(directory + "/ratings.csv", index=False)
os.remove(directory + "/title_ratings.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Episode

### Import file

In [26]:
df_episode = pd.read_csv(directory + "/title_episode.tsv", delimiter="\t", low_memory=False)
df_episode.shape

(5792364, 4)

`parent_id` is the identifier of the TV show and `t_id` is the identifier of the episode. For movies (in the other tables), `t_id` is the identifier of the movie.


In [27]:
df_episode.columns = ['t_id', 'parent_id', 'season', 'episode']

### Replacing missing values & id of id_columns


In [28]:
df_episode = clean_df(df_episode)
df_episode.head()

Unnamed: 0,t_id,parent_id,season,episode
0,41951,41038,1.0,9.0
1,42816,989125,1.0,17.0
2,42889,989125,,
3,43426,40051,3.0,42.0
4,43631,989125,2.0,16.0
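
To illustrate how `parent_id` ties episodes to their show, here is a small sketch that counts episodes per show. It rebuilds the five rows shown above by hand (`demo_episode` is a stand-in for the full `df_episode`).

```python
import pandas as pd

# The five rows from the head() output above, rebuilt by hand
demo_episode = pd.DataFrame(
    {
        "t_id": [41951, 42816, 42889, 43426, 43631],
        "parent_id": [41038, 989125, 989125, 40051, 989125],
        "season": [1.0, 1.0, None, 3.0, 2.0],
        "episode": [9.0, 17.0, None, 42.0, 16.0],
    }
)

# Number of episodes attached to each parent show, largest first
episodes_per_show = (
    demo_episode.groupby("parent_id")["t_id"].count().sort_values(ascending=False)
)

print(episodes_per_show)
```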


### Export Episodes 

Now that we have correct missing values and ids, let's export it!

In [29]:
df_episode.to_csv(directory + "/episode.csv", index=False)
os.remove(directory + "/title_episode.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Crew

### Import file

In [30]:
df_crew = pd.read_csv(directory + "/title_crew.tsv", delimiter="\t", low_memory=False)
df_crew.shape

(7939695, 3)

In [31]:
df_crew.columns = ['t_id', 'directors', 'writers']

### Replacing missing values & id of id_columns


In [32]:
df_crew = clean_df(df_crew)
df_crew.head()

Unnamed: 0,t_id,directors,writers
0,1,nm0005690,
1,2,nm0721526,
2,3,nm0721526,
3,4,nm0721526,
4,5,nm0005690,


### Flat out the table 

Some ids have multiple directors and/or writers, and I want each row to hold one value only. I'm basically redoing what I did in `4.1.6`.

In [33]:
on() # To keep the display on while this long cell runs

directors_lst = []
writers_lst = []

length = len(df_crew["t_id"])
for index in range(length):
    t_id = df_crew.loc[index, "t_id"]
    directors = df_crew.loc[index, "directors"]
    writers = df_crew.loc[index, "writers"]
    
    if type(directors) == str:
        for director in directors.split(","):
            directors_lst.append([t_id, director])
    
    if type(writers) == str:
        for writer in writers.split(","):
            writers_lst.append([t_id, writer])
    
    
    if index % 100000 == 0:
        print(f"{index:,d} / {length:,d} - {len(directors_lst):,d} / {len(writers_lst):,d}" + " "*20, end="\r")
    

7,900,000 / 7,939,695 - 5,978,662 / 9,094,823                    

In [34]:
df_directors = pd.DataFrame(directors_lst, columns = ["t_id", "directors"])
df_writers = pd.DataFrame(writers_lst, columns = ["t_id", "writers"])

df_crew = df_directors.merge(df_writers, how="outer")
df_crew.shape

(20459228, 3)
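
The merged table is larger than either input because an outer merge on `t_id` pairs every director of a title with every writer of that title. The sketch below shows this on made-up ids, then shows a "long" layout with one row per person and a `role` column as a possible alternative (my assumption, not what this notebook exports).

```python
import pandas as pd

# Made-up ids: title 1 has 2 directors and 3 writers
demo_directors = pd.DataFrame({"t_id": [1, 1, 2], "directors": ["nm1", "nm2", "nm3"]})
demo_writers = pd.DataFrame({"t_id": [1, 1, 1, 3], "writers": ["nm4", "nm5", "nm6", "nm7"]})

# Outer merge: title 1 yields 2 x 3 = 6 rows, titles 2 and 3 one row each
merged = demo_directors.merge(demo_writers, how="outer", on="t_id")

# Alternative long layout: one row per (title, person, role)
long_crew = pd.concat(
    [
        demo_directors.rename(columns={"directors": "n_id"}).assign(role="director"),
        demo_writers.rename(columns={"writers": "n_id"}).assign(role="writer"),
    ],
    ignore_index=True,
)

print(len(merged), len(long_crew))
```

The long layout grows with the sum of directors and writers per title rather than their product.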

### Export Crew 

Now that we have correct missing values and ids, let's export it!

In [35]:
df_crew.to_csv(directory + "/crew.csv", index=False)
os.remove(directory + "/title_crew.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Akas

### Import file

In [36]:
df_akas = pd.read_csv(directory + "/title_akas.tsv", delimiter="\t", low_memory=False)

df_akas.shape

(26400094, 8)

In [37]:
df_akas.columns = ['t_id', 'ordering', 'title', 'region', 'language', 
                   'types', 'attributes', 'isOriginalTitle']

### Replacing missing values & id of id_columns


In [38]:
df_akas = clean_df(df_akas)
df_akas.head()

Unnamed: 0,t_id,ordering,title,region,language,types,attributes,isOriginalTitle
0,1,1,Карменсіта,UA,,imdbDisplay,,0
1,1,2,Carmencita,DE,,,literal title,0
2,1,3,Carmencita - spanyol tánc,HU,,imdbDisplay,,0
3,1,4,Καρμενσίτα,GR,,imdbDisplay,,0
4,1,5,Карменсита,RU,,imdbDisplay,,0


### Export Akas 

Now that we have correct missing values and ids, let's export it!

In [39]:
df_akas.to_csv(directory + "/akas.csv", index=False)
os.remove(directory + "/title_akas.tsv")

<hr style="height:.9px;border:none;color:#333;background-color:#333;" />

## Names

### Import file

In [40]:
df_names = pd.read_csv(directory + "/name_basics.tsv", delimiter="\t", low_memory=False)

df_names.shape

(10958740, 6)

In [41]:
df_names.columns = ['n_id', 'name', 'birth', 
                          'death', 'profession','known_for']


### Replacing missing values & id of id_columns


In [42]:
df_names = clean_df(df_names)
df_names.head()

Unnamed: 0,n_id,name,birth,death,profession,known_for
0,1,Fred Astaire,1899,1987.0,"soundtrack,actor,miscellaneous","tt0031983,tt0050419,tt0053137,tt0072308"
1,2,Lauren Bacall,1924,2014.0,"actress,soundtrack","tt0071877,tt0038355,tt0117057,tt0037382"
2,3,Brigitte Bardot,1934,,"actress,soundtrack,music_department","tt0049189,tt0056404,tt0057345,tt0054452"
3,4,John Belushi,1949,1982.0,"actor,soundtrack,writer","tt0080455,tt0078723,tt0077975,tt0072562"
4,5,Ingmar Bergman,1918,2007.0,"writer,director,actor","tt0060827,tt0050986,tt0083922,tt0050976"


In [43]:
# Calling all movies where Brigitte Bardot played
df_principals[df_principals["n_id"] == 3]

Unnamed: 0,t_id,ordering,n_id,job_category,job_title,character_played
364765,44881,1,3,actress,,"[""Manina""]"
376459,46200,3,3,actress,,"[""Domino""]"
388875,47607,1,3,actress,,"[""Anna Schumann""]"
392444,48001,3,3,actress,,"[""Hélène Colbert""]"
393367,48103,2,3,actress,,"[""Sophie Dimater""]"
...,...,...,...,...,...,...
41940357,8881518,2,3,archive_footage,,"[""Self""]"
42053126,8920000,3,3,self,,"[""Self""]"
43180749,9310388,1,3,archive_footage,,"[""Self""]"
43247549,9332740,2,3,self,,"[""Self""]"


### Known_for flat out
#### Get each movie for which the person is known for


`known_for` doesn't list all the movies of each person, only the titles they are known for, packed into one comma-separated string, so we need to flatten it out.

In [44]:
data = []

for x in range(len(df_names["n_id"])):
    n_id = df_names.loc[x, "n_id"]
    name = df_names.loc[x, "name"]
    known_for = df_names.loc[x, "known_for"]
    
    if type(known_for) == str:
        for movie in known_for.split(","):
            data.append([n_id, name, movie])
            
    if x % 100000 == 0:
        length = len(df_names["n_id"])
        print(f"{x:,d} / {length:,d} - {len(data):,d}", end="\r")            
    

10,900,000 / 10,958,740 - 17,043,940

#### Create a df for known_for

In [45]:
df_known_for = pd.DataFrame(data, columns = ["n_id", "name", "known_for"])
df_known_for.shape

(17100801, 3)

In [46]:
df_known_for.head()

Unnamed: 0,n_id,name,known_for
0,1,Fred Astaire,tt0031983
1,1,Fred Astaire,tt0050419
2,1,Fred Astaire,tt0053137
3,1,Fred Astaire,tt0072308
4,2,Lauren Bacall,tt0071877
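
Note that `known_for` still holds raw `tt…` strings, because `clean_df` only converts columns whose name contains "id". If the table is meant to be joined with the integer `t_id` of the other tables, a conversion along these lines could be added. This is a sketch on rows rebuilt from the output above; the new `t_id` column is my assumption and is not part of the exported file.

```python
import pandas as pd

# Rows rebuilt by hand from the head() output above
demo_known_for = pd.DataFrame(
    {
        "n_id": [1, 1, 2],
        "name": ["Fred Astaire", "Fred Astaire", "Lauren Bacall"],
        "known_for": ["tt0031983", "tt0050419", "tt0071877"],
    }
)

# Pull the digits out of the tconst and cast them to int, as clean_df does for id columns
demo_known_for["t_id"] = (
    demo_known_for["known_for"].str.extract(r"(\d+)", expand=False).astype(int)
)

print(demo_known_for)
```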


### Export both tables: Names and Known_for

Now that we have correct missing values and ids, let's export it!

In [47]:
df_known_for.to_csv(directory + "/known_for.csv", index=False)

# We don't need the known_for column in names as we have a new df for it
df_names.drop("known_for", axis=1).to_csv(directory + "/names.csv", index=False)
os.remove(directory + "/name_basics.tsv")

In [48]:
# This is just for me when I run the script fully and I want to know when it's done. From my bed. 
on()
diff = round((time.time() - start)/60)
print(f"It took {diff} minutes")
print("✅"*1000)

It took 40 minutes
✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅✅