### Pandas Apply Lambda

Always remember the Zen of Python!

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


In [2]:
import pandas as pd

In [3]:
df = pd.DataFrame({"user" : [1,2,3,4], "salary": [1000,1100,1200,1300], "age" : [18,19,22,30]})

In [4]:
df

Unnamed: 0,user,salary,age
0,1,1000,18
1,2,1100,19
2,3,1200,22
3,4,1300,30


In [5]:
def suma_uno(valor):
    return valor + 1

In [6]:
suma_uno(100)

101

In [7]:
df.apply(suma_uno)

Unnamed: 0,user,salary,age
0,2,1001,19
1,3,1101,20
2,4,1201,23
3,5,1301,31
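With the default `axis=0`, `DataFrame.apply` hands each column to the function as a `Series`, so `suma_uno` receives whole columns rather than single numbers. The sketch below rebuilds the same small frame and uses a helper named `spy` (a hypothetical name, introduced only for this illustration) to record what the function receives. It also checks that the result matches plain arithmetic on the frame.

```python
import pandas as pd

df = pd.DataFrame({"user": [1, 2, 3, 4],
                   "salary": [1000, 1100, 1200, 1300],
                   "age": [18, 19, 22, 30]})

# Record the type of every object that apply passes to the function.
seen_types = []

def spy(col):
    seen_types.append(type(col).__name__)
    return col + 1

via_apply = df.apply(spy)      # default axis=0: one call per column
via_arithmetic = df + 1        # same values without apply

print(set(seen_types))
print(via_apply.equals(via_arithmetic))
```

For simple element-wise arithmetic like this, `df + 1` is usually the shorter and faster choice. `apply` earns its place when the logic cannot be written as a column operation.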


In [8]:
lambda valor: valor + 1

<function __main__.<lambda>(valor)>

In [9]:
x = lambda valor: valor + 1

In [10]:
type(x)

function

In [11]:
x(22)

23

In [12]:
df.apply(x)

Unnamed: 0,user,salary,age
0,2,1001,19
1,3,1101,20
2,4,1201,23
3,5,1301,31


In [13]:
df.apply(lambda row: row['salary'] + 500, axis = 1)

0    1500
1    1600
2    1700
3    1800
dtype: int64
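With `axis=1`, the lambda receives one row at a time as a `Series` whose index holds the column names, which is why `row['salary']` works. A minimal sketch, assuming the same frame as above, that inspects one row and compares the row-wise result with ordinary column arithmetic:

```python
import pandas as pd

df = pd.DataFrame({"user": [1, 2, 3, 4],
                   "salary": [1000, 1100, 1200, 1300],
                   "age": [18, 19, 22, 30]})

# A single row is a Series indexed by the column names.
first_row = df.iloc[0]
print(list(first_row.index))

row_wise = df.apply(lambda row: row["salary"] + 500, axis=1)
vectorized = df["salary"] + 500   # whole-column arithmetic, no apply

print(row_wise.tolist() == vectorized.tolist())
```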

In [14]:
df['salary_corrected'] = df.apply(lambda row: row['salary'] + 500, axis = 1)

In [15]:
df

Unnamed: 0,user,salary,age,salary_corrected
0,1,1000,18,1500
1,2,1100,19,1600
2,3,1200,22,1700
3,4,1300,30,1800


In [16]:
df['worker_name'] = ['valeria', 'elena' , 'octavio' , 'victor']

In [17]:
df

Unnamed: 0,user,salary,age,salary_corrected,worker_name
0,1,1000,18,1500,valeria
1,2,1100,19,1600,elena
2,3,1200,22,1700,octavio
3,4,1300,30,1800,victor


In [18]:
df['worker_name_corrected'] = df.apply(lambda row: row['worker_name'].capitalize(), axis = 1)  # axis=1 (same as axis='columns'): the lambda receives one row at a time

In [19]:
df

Unnamed: 0,user,salary,age,salary_corrected,worker_name,worker_name_corrected
0,1,1000,18,1500,valeria,Valeria
1,2,1100,19,1600,elena,Elena
2,3,1200,22,1700,octavio,Octavio
3,4,1300,30,1800,victor,Victor
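String columns also expose the `.str` accessor, which applies a string method to every element without a row-wise `apply`. A short sketch on the same worker names, offered as an alternative rather than a replacement for the cell above:

```python
import pandas as pd

names = pd.Series(["valeria", "elena", "octavio", "victor"], name="worker_name")

# .str.capitalize() upper-cases the first character of each element.
capitalized = names.str.capitalize()
print(capitalized.tolist())
```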


Movie data

In [20]:
movie = pd.read_csv('../data/input/IMDB-Movie-Data.csv')

In [21]:
movie

Unnamed: 0,Rank,Title,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore
0,1,Guardians of the Galaxy,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0
1,2,Prometheus,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0
2,3,Split,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0
3,4,Sing,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0
4,5,Suicide Squad,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0
...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,Secret in Their Eyes,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,,45.0
996,997,Hostel: Part II,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0
997,998,Step Up 2: The Streets,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0
998,999,Search Party,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,,22.0


Challenge 1

In [22]:
def categorize(value):
    # Each branch is only reached when the previous bounds failed,
    # so the lower-bound checks are implicit.
    if value < 1000:
        return 'categoria_1'
    elif value < 10000:
        return 'categoria_2'
    elif value < 100000:
        return 'categoria_3'
    elif value < 1000000:
        return 'categoria_4'
    else:
        return 'categoria_5'

In [23]:
movie.apply(lambda row: categorize(row['Votes']), axis = 1)

0      categoria_4
1      categoria_4
2      categoria_4
3      categoria_3
4      categoria_4
          ...     
995    categoria_3
996    categoria_3
997    categoria_3
998    categoria_2
999    categoria_3
Length: 1000, dtype: object

In [24]:
movie['bin'] = movie.apply(lambda row: categorize(row['Votes']), axis = 1)
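Binning a numeric column into labelled ranges is also what `pd.cut` does. The sketch below is one possible equivalent of `categorize`, assuming left-closed intervals (`right=False`) and open-ended outer edges. The five vote counts are values that appear in the outputs of this notebook.

```python
import numpy as np
import pandas as pd

def categorize(value):
    if value < 1000:
        return 'categoria_1'
    elif value < 10000:
        return 'categoria_2'
    elif value < 100000:
        return 'categoria_3'
    elif value < 1000000:
        return 'categoria_4'
    else:
        return 'categoria_5'

votes = pd.Series([757074, 60545, 4881, 61, 1791916], name="Votes")

# Interval edges; right=False makes each bin [lower, upper).
edges = [-np.inf, 1_000, 10_000, 100_000, 1_000_000, np.inf]
labels = [f"categoria_{i}" for i in range(1, 6)]

binned = pd.cut(votes, bins=edges, labels=labels, right=False)

print(binned.astype(str).tolist())
print(binned.astype(str).tolist() == votes.apply(categorize).tolist())
```

Note that `pd.cut` returns a categorical column, so convert it with `astype(str)` if plain strings are needed.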

Challenge 2

In [25]:
movie

Unnamed: 0,Rank,Title,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore,bin
0,1,Guardians of the Galaxy,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0,categoria_4
1,2,Prometheus,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0,categoria_4
2,3,Split,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0,categoria_4
3,4,Sing,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0,categoria_3
4,5,Suicide Squad,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0,categoria_4
...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,Secret in Their Eyes,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,,45.0,categoria_3
996,997,Hostel: Part II,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0,categoria_3
997,998,Step Up 2: The Streets,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0,categoria_3
998,999,Search Party,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,,22.0,categoria_2


In [26]:
movie['Revenue (Millions)'].min()

0.0

In [27]:
def revenue_per_minute(revenue, minutes):
    return revenue/minutes

In [28]:
revenue_per_minute(movie['Revenue (Millions)'], movie['Runtime (Minutes)'])

0      2.753140
1      1.019839
2      1.180513
3      2.502963
4      2.642439
         ...   
995         NaN
996    0.186596
997    0.591939
998         NaN
999    0.225747
Length: 1000, dtype: float64

In [29]:
movie.apply(lambda row: revenue_per_minute(row['Revenue (Millions)'], row['Runtime (Minutes)']), axis = 1)

0      2.753140
1      1.019839
2      1.180513
3      2.502963
4      2.642439
         ...   
995         NaN
996    0.186596
997    0.591939
998         NaN
999    0.225747
Length: 1000, dtype: float64

In [30]:
movie['Revenue_por_minute'] = movie.apply(lambda row: revenue_per_minute(row['Revenue (Millions)'], row['Runtime (Minutes)']), axis = 1)

In [31]:
movie

Unnamed: 0,Rank,Title,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore,bin,Revenue_por_minute
0,1,Guardians of the Galaxy,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0,categoria_4,2.753140
1,2,Prometheus,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0,categoria_4,1.019839
2,3,Split,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0,categoria_4,1.180513
3,4,Sing,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0,categoria_3,2.502963
4,5,Suicide Squad,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0,categoria_4,2.642439
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,Secret in Their Eyes,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,,45.0,categoria_3,
996,997,Hostel: Part II,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0,categoria_3,0.186596
997,998,Step Up 2: The Streets,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0,categoria_3,0.591939
998,999,Search Party,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,,22.0,categoria_2,


Challenge 3

In [32]:
def new_rating(genre,rating):
    if 'Thriller' in genre and 'Comedy' in genre: 
        return rating 
    elif 'Thriller' in genre: 
        return rating + 1
    elif 'Comedy' in genre:
        return rating - 1 
    else:
        return rating

In [33]:
new_rating('Comedy, Thriller',1)

1

In [34]:
new_rating(movie['Genre'], movie['Rating'])

0      8.1
1      7.0
2      7.3
3      7.2
4      6.2
      ... 
995    6.2
996    5.5
997    6.2
998    5.6
999    5.3
Name: Rating, Length: 1000, dtype: float64
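The ratings come back unchanged here because `in` applied to a whole `Series` tests membership in the index labels, not in the values. `'Thriller' in genre` is therefore `False` for the entire column and the function falls through to the final `return rating`. A minimal sketch with two genre strings taken from the table above:

```python
import pandas as pd

genre = pd.Series(["Horror,Thriller", "Animation,Comedy,Family"])

in_series = "Thriller" in genre            # looks at index labels 0 and 1
label_in_series = 0 in genre               # 0 is an index label
in_string = "Thriller" in genre.iloc[0]    # ordinary substring test on one string
per_element = genre.str.contains("Thriller", regex=False).tolist()

print(in_series, label_in_series, in_string, per_element)
```

This is why the row-wise `apply` in the next cell is needed for `new_rating` as written: inside the lambda, `row['Genre']` is a single string.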

In [35]:
movie.apply(lambda row: new_rating(row['Genre'], row['Rating']), axis = 1)

0      8.1
1      7.0
2      8.3
3      6.2
4      6.2
      ... 
995    6.2
996    5.5
997    6.2
998    4.6
999    4.3
Length: 1000, dtype: float64

In [36]:
movie['New_rating'] = movie.apply(lambda row: new_rating(row['Genre'], row['Rating']), axis = 1)
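The same rule can be sketched without a row-wise `apply` by building boolean masks with `.str.contains` and choosing an adjustment with `np.select`. This is an alternative formulation, not the approach used in the notebook. The first three rows reuse genres and ratings shown above, while the fourth row is a hypothetical movie added to exercise the "both genres" branch.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Genre": ["Horror,Thriller", "Animation,Comedy,Family",
              "Action,Adventure,Sci-Fi", "Comedy,Thriller"],  # last row is hypothetical
    "Rating": [7.3, 7.2, 8.1, 6.0],
})

is_thriller = sample["Genre"].str.contains("Thriller", regex=False)
is_comedy = sample["Genre"].str.contains("Comedy", regex=False)

# Conditions are checked in order, mirroring the if/elif chain of new_rating.
adjustment = np.select(
    [is_thriller & is_comedy, is_thriller, is_comedy],
    [0.0, 1.0, -1.0],
    default=0.0,
)

sample["New_rating"] = sample["Rating"] + adjustment
print(sample)
```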

In [37]:
movie

Unnamed: 0,Rank,Title,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore,bin,Revenue_por_minute,New_rating
0,1,Guardians of the Galaxy,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0,categoria_4,2.753140,8.1
1,2,Prometheus,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0,categoria_4,1.019839,7.0
2,3,Split,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0,categoria_4,1.180513,8.3
3,4,Sing,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0,categoria_3,2.502963,6.2
4,5,Suicide Squad,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0,categoria_4,2.642439,6.2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,Secret in Their Eyes,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,,45.0,categoria_3,,6.2
996,997,Hostel: Part II,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0,categoria_3,0.186596,5.5
997,998,Step Up 2: The Streets,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0,categoria_3,0.591939,6.2
998,999,Search Party,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,,22.0,categoria_2,,4.6


Challenge 4

In [44]:
movie.columns

Index(['Rank', 'Title', 'Genre', 'Description', 'Director', 'Actors', 'Year',
       'Runtime (Minutes)', 'Rating', 'Votes', 'Revenue (Millions)',
       'Metascore', 'bin', 'Revenue_por_minute', 'New_rating'],
      dtype='object')

In [45]:
def es_primo(n):
    """Return True if n is a prime number, False otherwise (None counts as not prime)."""
    if n is None or n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

In [46]:
def ascii_sum(title):
    """Return the sum of the ASCII values of every character in the title."""
    return sum(ord(char) for char in title)

In [47]:
def entero_suma_ascii_div_votos(title, votes):
    """Return the integer part of the sum of the title's ASCII values divided by the number of votes."""
    if votes == 0:
        return None  # Avoid division by zero
    return int(ascii_sum(title) / votes)

In [48]:
movie['Es_primo'] = movie.apply(lambda row: es_primo(entero_suma_ascii_div_votos(row['Title'], row['Votes'])), axis=1)

# Show the first results as a check
print(movie[['Title', 'Votes', 'Es_primo']].head())

                     Title   Votes  Es_primo
0  Guardians of the Galaxy  757074     False
1               Prometheus  485820     False
2                    Split  157606     False
3                     Sing   60545     False
4            Suicide Squad  393727     False
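Every row shown is `False` because the vote counts are far larger than the character sums, so the integer quotient is 0, and 0 is not prime. A small worked example for the title "Sing" and its 60545 votes, both taken from the table above:

```python
def ascii_sum(title):
    """Sum of the code points of every character in the title."""
    return sum(ord(char) for char in title)

title, votes = "Sing", 60545

total = ascii_sum(title)          # 83 + 105 + 110 + 103
quotient = int(total / votes)     # integer part of a value below 1

print(total, quotient)
```

Only titles whose character sum is at least twice the vote count can produce a quotient of 2 or more, which is the smallest prime.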


Challenge 5

In [50]:
from sklearn.preprocessing import MinMaxScaler

# Normalize the columns to the [0, 1] range
scaler = MinMaxScaler()
movie[['Rating_norm', 'Votes_norm', 'Revenue_norm', 'Metascore_norm']] = scaler.fit_transform(movie[['Rating', 'Votes', 'Revenue (Millions)', 'Metascore']])

# Define the weight of each column
weights = {
    'Rating_norm': 0.4,
    'Votes_norm': 0.2,
    'Revenue_norm': 0.2,
    'Metascore_norm': 0.2
}

# Compute the new ranking index
movie['Ranking_Index'] = (movie['Rating_norm'] * weights['Rating_norm'] +
                       movie['Votes_norm'] * weights['Votes_norm'] +
                       movie['Revenue_norm'] * weights['Revenue_norm'] +
                       movie['Metascore_norm'] * weights['Metascore_norm'])

# Check whether 'Ranking_Index' contains null values
if movie['Ranking_Index'].isnull().any():
    print("There are null values in 'Ranking_Index'. They will be handled before the conversion to integers.")

# Replace null values in 'Ranking_Index' with 0
# (assign the result back instead of calling fillna(inplace=True) on a column selection)
movie['Ranking_Index'] = movie['Ranking_Index'].fillna(0)

# Sort the DataFrame by the new ranking index
movie = movie.sort_values(by='Ranking_Index', ascending=False)

# Assign the new ranks
movie['New_Rank'] = movie['Ranking_Index'].rank(method='dense', ascending=False).astype(int)

# Show the results
print(movie[['Title', 'Rating', 'Votes', 'Revenue (Millions)', 'Metascore', 'Ranking_Index', 'New_Rank']])


There are null values in 'Ranking_Index'. They will be handled before the conversion to integers.
                                          Title  Rating    Votes  \
54                              The Dark Knight     9.0  1791916   
50   Star Wars: Episode VII - The Force Awakens     8.1   661608   
80                                    Inception     8.8  1583625   
87                                       Avatar     7.8   935408   
124                       The Dark Knight Rises     8.5  1222645   
..                                          ...     ...      ...   
445                                 Silent Hill     6.6   184152   
463                              Predestination     7.5   187760   
477                                         Pet     5.7     8404   
478                              Paint It Black     8.3       61   
998                                Search Party     5.6     4881   

     Revenue (Millions)  Metascore  Ranking_Index  New_Rank  
54               533.32       
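Min-max scaling maps each column to the range [0, 1] through (x - min) / (max - min). The sketch below applies that formula with plain pandas to a tiny made-up frame and shows why a missing input leaves the weighted index null until `fillna` is applied. The sample values and the weights 0.6 and 0.4 are assumptions for this two-column illustration, not figures from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, for illustration only.
scores = pd.DataFrame({"Rating": [9.0, 8.0, 5.0],
                       "Metascore": [80.0, 60.0, np.nan]})

# Min-max formula applied column by column; min() and max() skip NaN by default.
normalized = (scores - scores.min()) / (scores.max() - scores.min())

# Weighted sum: a NaN in any input propagates to the result.
index = 0.6 * normalized["Rating"] + 0.4 * normalized["Metascore"]
print(index.isnull().tolist())

# Replacing NaN with 0 pushes incomplete rows to the bottom of the ranking.
index_filled = index.fillna(0)
print(index_filled.tolist())
```

Filling with 0 is a design choice: it ranks movies with missing revenue or Metascore last, regardless of their rating and votes.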

Bonus

In [52]:
import hashlib

# Compute the SHA-256 hash of the description and sum its numeric digits
# (the hex letters a-f are skipped by the isnumeric() filter)
def sha256_sum(description):
    hash_object = hashlib.sha256(description.encode())
    hash_hex = hash_object.hexdigest()
    return sum(int(char, 16) for char in hash_hex if char.isnumeric())

# Apply the function to every description
movie['SHA256_Sum'] = movie['Description'].apply(sha256_sum)

# Check whether the sum lies between the revenue and the votes
movie['Hidden_Pattern'] = movie.apply(lambda row: row['Revenue (Millions)'] <= row['SHA256_Sum'] <= row['Votes'], axis=1)

# Show the results
movie[['Title', 'Description', 'SHA256_Sum', 'Revenue (Millions)', 'Votes', 'Hidden_Pattern']]


Unnamed: 0,Title,Description,SHA256_Sum,Revenue (Millions),Votes,Hidden_Pattern
54,The Dark Knight,When the menace known as the Joker wreaks havo...,187,533.32,1791916,False
50,Star Wars: Episode VII - The Force Awakens,Three decades after the defeat of the Galactic...,193,936.63,661608,False
80,Inception,"A thief, who steals corporate secrets through ...",173,292.57,1583625,False
87,Avatar,A paraplegic marine dispatched to the moon Pan...,166,760.51,935408,False
124,The Dark Knight Rises,Eight years after the Joker's reign of anarchy...,179,448.13,1222645,False
...,...,...,...,...,...,...
445,Silent Hill,"A woman, Rose, goes in search for her adopted ...",129,46.98,184152,True
463,Predestination,"For his final assignment, a top temporal agent...",167,,187760,False
477,Pet,A psychological thriller about a man who bumps...,183,,8404,False
478,Paint It Black,A young woman attempts to deal with the death ...,166,,61,False
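Two details explain the `Hidden_Pattern` column. The digit sum is bounded, since a SHA-256 hex digest has 64 characters and each contributes at most 9. Any comparison with `NaN` is `False`, so rows with missing revenue can never satisfy the chained comparison. A short sketch using numbers from the table above, where `sha256_digit_sum` is a hypothetical re-statement of `sha256_sum` written for this example:

```python
import hashlib

def sha256_digit_sum(text):
    """Sum the decimal digits (0-9) of the SHA-256 hex digest; letters a-f are skipped."""
    digest = hashlib.sha256(text.encode()).hexdigest()
    return sum(int(ch) for ch in digest if ch.isdigit())

# Upper bound: 64 hex characters, each contributing at most 9.
within_bound = sha256_digit_sum("any description") <= 64 * 9

nan = float("nan")
silent_hill = 46.98 <= 129 <= 184152      # revenue <= sum <= votes
predestination = nan <= 167 <= 187760     # missing revenue

print(within_bound, silent_hill, predestination)
```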
