## Library Installation

Ensure that the necessary libraries are installed before running the notebook.

In [1]:
# Install openpyxl (the engine pandas uses to read .xlsx files), then import the libraries
%pip install openpyxl
from openpyxl import load_workbook
import pandas as pd
import os
import re
import itertools

print("Skeleton setup complete!")


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.1.2[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.
Skeleton setup complete!


## Variable Declaration

Set the variables for file paths, sheet names, and other configurations. Update these variables for each specific project.

In [2]:
# Path to the Excel file (change this for each project)
excel_file_path = '/workspaces/Finetwork-Automation/retention/CARGA SERVICIOS.xlsx'
csv_file_path = '/workspaces/Finetwork-Automation/inbound/Informe de métricas históricas.csv'

# Range to read (change this for each project)
start_row = 5
end_row = 30
usecols = 'A:B'

print("Variables defined correctly!")

Variables defined correctly!


## Extract Data from the Sheets

Extract data from each sheet within the specified range and load it directly into a DataFrame. This run loads only the 'CT PORTA' sheet.

In [3]:
def load_sheet_as_dataframe(file_path, sheet_name, start_row, end_row, usecols):
    # Load data from the specified sheet and range into a DataFrame
    df = pd.read_excel(file_path, sheet_name=sheet_name, usecols=usecols, skiprows=start_row-1, nrows=end_row-start_row+1)
    print(f"Data from '{sheet_name}' sheet loaded successfully.")
    return df

# Extract data from the 'CT PORTA' sheet
ct_porta_df = load_sheet_as_dataframe(excel_file_path, 'CT PORTA', start_row, end_row, usecols)

# Display the DataFrame
print("CT PORTA DataFrame:")
display(ct_porta_df.head())

Data from 'CT PORTA' sheet loaded successfully.
CT PORTA DataFrame:


Unnamed: 0,Etiquetas de fila,Cuenta de Agente Respuesta N1
0,Carmen Romero,2
1,Irene Mateos,1
2,Pedro Manzanero,2
3,Total general,5
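
The loader above passes `skiprows=start_row-1` and `nrows=end_row-start_row+1`. By default `read_excel` treats the first row it reads as the header, and `nrows` counts the rows parsed after that header, so the read may extend one row past `end_row`. The following is a minimal sketch of the arithmetic, assuming the first row of the range holds the column names; the helper name `excel_range_to_read_args` is hypothetical and not part of the notebook.

```python
def excel_range_to_read_args(start_row, end_row, header_in_range=True):
    """Translate a 1-based, inclusive Excel row range into read_excel arguments.

    Hypothetical helper for illustration; assumes start_row <= end_row.
    """
    # Rows above the range are skipped entirely.
    skiprows = start_row - 1
    # Total number of sheet rows covered by the range.
    total_rows = end_row - start_row + 1
    if header_in_range:
        # The first row of the range becomes the header, so one fewer data row.
        return {"skiprows": skiprows, "nrows": total_rows - 1, "header": 0}
    # No header row: every row in the range is data.
    return {"skiprows": skiprows, "nrows": total_rows, "header": None}


# With the notebook's values (rows 5 to 30, header in row 5):
read_args = excel_range_to_read_args(5, 30)
print(read_args)
```

The resulting dictionary could be unpacked into the call, for example `pd.read_excel(file_path, sheet_name=sheet_name, usecols=usecols, **read_args)`. For a sheet with only a few populated rows, as here, both variants return the same data.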


## Load Agents List

Load the list of all agents from the "AGENTES PORTA" sheet.

In [4]:
# Load the list of agents
agents_df = pd.read_excel(excel_file_path, sheet_name='AGENTES PORTA', usecols='A')
agents_list = agents_df.iloc[:, 0].tolist()
print("Agents list loaded successfully!")
print(agents_list)

Agents list loaded successfully!
['Maria Jose  Moreno', 'Irene Mateos', 'Maria Jesus Bruno', 'Carmen Romero', 'Pedro Manzanero', 'Virginia Aragon', 'Tamara Conde', 'Rosa Vilches', 'Manuel Cabra']


## Verify and Complete Data

Verify that all agents are present in each DataFrame. If an agent is missing, add a row with zeros for that agent. Rows whose label is not in the agents list (such as the "Total general" summary row) are removed.

In [5]:
def ensure_all_agents(df, agents_list):
    """
    Ensure all agents are present in the DataFrame. Add missing agents with zero values and remove agents not in the list.
    
    Parameters:
    df (pd.DataFrame): The DataFrame to check and update.
    agents_list (list): The list of all agents.
    
    Returns:
    pd.DataFrame: The updated DataFrame with all agents.
    """
    # Get the list of agents in the DataFrame
    existing_agents = df.iloc[:, 0].tolist()
    
    # Find missing agents
    missing_agents = [agent for agent in agents_list if agent not in existing_agents]
    
    # Add rows for missing agents with zero values
    for agent in missing_agents:
        zero_row = pd.DataFrame([[agent] + [0] * (df.shape[1] - 1)], columns=df.columns)
        df = pd.concat([df, zero_row], ignore_index=True)
    
    # Count rows whose agent is not in the agents list before dropping them
    keep_mask = df.iloc[:, 0].isin(agents_list)
    removed_count = int((~keep_mask).sum())

    # Remove agents not in the agents list (copy so later column assignments are safe)
    df = df[keep_mask].copy()

    print(f"Added {len(missing_agents)} missing agents and removed {removed_count} agents not in the list.")
    return df

# Apply the function to each DataFrame
ct_porta_df = ensure_all_agents(ct_porta_df, agents_list)

# Display the updated DataFrames
print("CT PORTA DataFrame after ensuring all agents:")
display(ct_porta_df.head(15))

Added 6 missing agents and removed 1 agents not in the list.
CT PORTA DataFrame after ensuring all agents:


Unnamed: 0,Etiquetas de fila,Cuenta de Agente Respuesta N1
0,Carmen Romero,2
1,Irene Mateos,1
2,Pedro Manzanero,2
4,Maria Jose Moreno,0
5,Maria Jesus Bruno,0
6,Virginia Aragon,0
7,Tamara Conde,0
8,Rosa Vilches,0
9,Manuel Cabra,0
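
The same alignment can also be expressed with `reindex`, which adds missing labels, drops unlisted ones, and orders the rows to match the agents list in one step. This is an alternative sketch rather than the notebook's method. The function name `align_to_agents` and the sample column name `count` are hypothetical, and the sketch assumes the first column holds the agent names.

```python
import pandas as pd


def align_to_agents(df, agents):
    """Return df with exactly one row per agent, in the order of `agents`.

    Hypothetical alternative to ensure_all_agents, shown for illustration.
    """
    key = df.columns[0]
    return (
        df.drop_duplicates(subset=key)      # reindex needs unique labels
          .set_index(key)
          .reindex(agents, fill_value=0)    # add missing agents, drop unlisted rows
          .rename_axis(key)
          .reset_index()
    )


# Small sample shaped like the sheet, including a summary row to be dropped.
sample = pd.DataFrame({
    "Etiquetas de fila": ["Carmen Romero", "Irene Mateos", "Total general"],
    "count": [2, 1, 3],
})
aligned = align_to_agents(sample, ["Irene Mateos", "Carmen Romero", "Rosa Vilches"])
print(aligned)
```

Unlike the loop-based version, this resets the index and reorders the rows to follow the agents list, which may or may not be what a given report needs.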


## Assign Values to Emails

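Assign a numerical value to each agent and add it as a new `email_value` column in the DataFrames. The mapping is keyed by agent name, so each name must match the DataFrame exactly. Unmatched names get an empty value (NaN).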

In [9]:
# Dictionary mapping agent names to their respective email values
email_values = {
    'Maria Jose Moreno': 55,
    'Irene Mateos': 2,
    'Maria Jesus Bruno': 3,
    'Carmen Romero': 4,
    'Pedro Manzanero': 5,
    'Virginia Aragon': 6,
    'Tamara Conde': 7,
    'Rosa Vilches': 8,
    'Manuel Cabra': 9
}

# Add a new column to each DataFrame with the email values
def add_email_values(df, email_values):
    df['email_value'] = df.iloc[:, 0].map(email_values)
    return df

# Apply the function to each DataFrame
ct_porta_df = add_email_values(ct_porta_df, email_values)

# Display the updated DataFrames with the new 'email_value' column
print("CT PORTA DataFrame with email values:")
display(ct_porta_df.head(15))

CT PORTA DataFrame with email values:


Unnamed: 0,Etiquetas de fila,Cuenta de Agente Respuesta N1,email_value
0,Carmen Romero,2,4.0
1,Irene Mateos,1,2.0
2,Pedro Manzanero,2,5.0
4,Maria Jose Moreno,0,
5,Maria Jesus Bruno,0,3.0
6,Virginia Aragon,0,6.0
7,Tamara Conde,0,7.0
8,Rosa Vilches,0,8.0
9,Manuel Cabra,0,9.0
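
The empty `email_value` for Maria Jose Moreno is consistent with a key mismatch. The agents list printed earlier shows that name with two spaces (`'Maria Jose  Moreno'`), while the dictionary key has one. Below is a minimal sketch of one way to detect and avoid this by collapsing whitespace before the lookup. The helper `normalize_name` is hypothetical, and the two-entry sample is only a subset of the notebook's data.

```python
import re


def normalize_name(name):
    """Collapse runs of whitespace to a single space and trim the ends."""
    return re.sub(r"\s+", " ", str(name)).strip()


# Subset of the notebook's data: the first name contains a double space.
agents = ['Maria Jose  Moreno', 'Irene Mateos']
values = {'Maria Jose Moreno': 55, 'Irene Mateos': 2}

# Names that would map to NaN without and with normalization.
unmatched_raw = [a for a in agents if a not in values]
unmatched_clean = [a for a in map(normalize_name, agents) if a not in values]

print("Unmatched before normalization:", unmatched_raw)
print("Unmatched after normalization:", unmatched_clean)
```

In the notebook itself, the same cleanup could be applied to the name column before mapping, for instance with `df.iloc[:, 0].str.replace(r"\s+", " ", regex=True).str.strip()`, and to `agents_list` right after it is loaded.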
