# Inbound Notebook

This notebook is designed to semi-automate the reporting process for the Inbound team. It will streamline data extraction, transformation, and loading into a pre-formatted Excel file.

## Manual Preparation

The first step involves manually preparing the data in Excel:

1. **Filter the Pivot Table:**
   - Apply filters to the pivot table to extract the following categories:
     - Active
     - Canceled
     - Pending Signature
     - Net

2. **Create Separate Sheets:**
   - For each category (Active, Canceled, Pending Signature, Net), create a separate sheet in the Excel file containing the filtered data.

3. **Save the Excel File:**
   - Save the prepared Excel file with a specific name, ensuring it contains the sheets with the filtered data.

4. **Upload the Excel File:**
   - Upload the prepared Excel file to the designated directory.

## Library Installation

Ensure that the necessary libraries are installed before running the notebook.

In [1]:
# Import necessary libraries
%pip install openpyxl
from openpyxl import load_workbook
import pandas as pd
import os
import re

print("Skeleton setup complete!")


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.1.2[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.
Skeleton setup complete!


## Variable Declaration

Set the variables for file paths, sheet names, and other configurations. Update these variables for each specific project.

In [2]:
# Path to the Excel file (change this for each project)
excel_file_path = '/workspaces/Finetwork-Automation/inbound/Metabase.xlsx'

# Sheet names for different categories
sheet_active = 'Active'
sheet_canceled = 'Canceled'
sheet_pending = 'Pending Signature'
sheet_net = 'Net'

# Range to read (change this for each project)
start_row = 8
end_row = 65
usecols = 'A:AF'

print("Variables defined correctly!")

Variables defined correctly!


## Extract Data from 'Active' Sheet

Extract data from the "Active" sheet within the specified range and convert it directly to a DataFrame.

## Verify Columns in 'Active' Sheet

Verify the number of columns in the "Active" sheet to ensure the range is within bounds.

In [3]:
# Function to verify the number of columns
def verify_columns(file_path, sheet_name):
    workbook = load_workbook(filename=file_path, data_only=True)
    sheet = workbook[sheet_name]
    max_column = sheet.max_column
    return max_column

# Check the number of columns in the 'Active' sheet
max_column_active = verify_columns(excel_file_path, 'Active')
print(f"Max column in 'Active' sheet: {max_column_active}")

# Check if the number of columns matches the expected range
expected_columns = 32  # Columns from A to AF (inclusive)
if max_column_active < expected_columns:
    usecols = f"A:{chr(64+max_column_active)}"
    print(f"Adjusted usecols to: {usecols}")
else:
    print(f"Using default usecols: {usecols}")

Max column in 'Active' sheet: 7
Adjusted usecols to: A:G


## Extract Data from 'Active' Sheet

Extract data from the "Active" sheet within the specified range and convert it directly to a DataFrame.

In [4]:
def load_sheet_as_dataframe(file_path, sheet_name, start_row, end_row, usecols):
    # Load data from the specified sheet and range into a DataFrame
    df = pd.read_excel(file_path, sheet_name=sheet_name, usecols=usecols, skiprows=start_row-1, nrows=end_row-start_row+1)
    print(f"Data from '{sheet_name}' sheet loaded successfully.")
    return df

# Extract data from 'Active' sheet
active_df = load_sheet_as_dataframe(excel_file_path, 'Active', start_row, end_row, usecols)

Data from 'Active' sheet loaded successfully.


## Extract Data from 'Canceled' Sheet

Extract data from the "Canceled" sheet within the specified range and convert it directly to a DataFrame.

In [5]:
# Extract data from 'Canceled' sheet
canceled_df = load_sheet_as_dataframe(excel_file_path, 'Canceled', start_row, end_row, usecols)

Data from 'Canceled' sheet loaded successfully.


## Extract Data from 'Pending Signature' Sheet

Extract data from the "Pending Signature" sheet within the specified range and convert it directly to a DataFrame.

In [6]:
# Extract data from 'Pending Signature' sheet
pending_signature_df = load_sheet_as_dataframe(excel_file_path, 'Pending Signature', start_row, end_row, usecols)

Data from 'Pending Signature' sheet loaded successfully.


## Extract Data from 'Net' Sheet

Extract data from the "Net" sheet within the specified range and convert it directly to a DataFrame.

In [7]:
# Extract data from 'Net' sheet
net_df = load_sheet_as_dataframe(excel_file_path, 'Net', start_row, end_row, usecols)

Data from 'Net' sheet loaded successfully.


## Display DataFrames

Display the first few rows of each DataFrame to verify the data.

In [8]:
# Display the DataFrames
print("Active DataFrame:")
display(active_df.head(65))

print("Canceled DataFrame:")
display(canceled_df.head(65))

print("Pending Signature DataFrame:")
display(pending_signature_df.head(65))

print("Net DataFrame:")
display(net_df.head(65))

Active DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,382.0,356.0,217.0,141.0,48.0,1144
1,albaaraujo@originaltelecom.es,11.0,8.0,,,,19
2,albertocanto@originaltelecom.es,9.0,9.0,,,,18
3,albertosanchez@originaltelecom.es,17.0,11.0,,,,28
4,anasanchez@originaltelecom.es,,,24.0,19.0,,43
5,antonio.reina@originaltelecom.es,11.0,11.0,,,5.0,27
6,azahara.garcia@originaltelecom.es,,,,17.0,,17
7,beatriz.gomez@originaltelecom.es,11.0,15.0,,,4.0,30
8,carolinafuentes@originaltelecom.es,8.0,8.0,,,3.0,19
9,cesar.arnaldo@originaltelecom.es,,,20.0,9.0,,29


Canceled DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,79.0,44.0,23.0,19.0,7.0,172
1,albertocanto@originaltelecom.es,,5.0,,,,5
2,albertosanchez@originaltelecom.es,2.0,,,,,2
3,anasanchez@originaltelecom.es,,,2.0,,,2
4,antonio.reina@originaltelecom.es,,5.0,,,,5
5,azahara.garcia@originaltelecom.es,,,,3.0,,3
6,beatriz.gomez@originaltelecom.es,1.0,,,,,1
7,carolinafuentes@originaltelecom.es,,1.0,,,,1
8,cesar.arnaldo@originaltelecom.es,,,4.0,,,4
9,david.molero@originaltelecom.es,3.0,1.0,,,,4


Pending Signature DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,30.0,27.0,6.0,10.0,9.0,82
1,albaaraujo@originaltelecom.es,,1.0,,,,1
2,albertocanto@originaltelecom.es,4.0,,,,,4
3,antonio.reina@originaltelecom.es,,1.0,,,,1
4,azahara.garcia@originaltelecom.es,,,,1.0,,1
5,carolinafuentes@originaltelecom.es,2.0,,,,,2
6,cesar.arnaldo@originaltelecom.es,,,2.0,3.0,,5
7,david.molero@originaltelecom.es,,2.0,,,,2
8,elenaborrero@originaltelecom.es,1.0,2.0,,,,3
9,enrique.miranda@originaltelecom.es,1.0,,,,,1


Net DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,251.0,152.0,63.0,55.0,21.0,542
1,albaaraujo@originaltelecom.es,7.0,3.0,,,,10
2,albertocanto@originaltelecom.es,6.0,5.0,,,,11
3,albertosanchez@originaltelecom.es,11.0,5.0,,,,16
4,anasanchez@originaltelecom.es,,,7.0,6.0,,13
5,antonio.reina@originaltelecom.es,7.0,5.0,,,2.0,14
6,azahara.garcia@originaltelecom.es,,,,7.0,,7
7,beatriz.gomez@originaltelecom.es,7.0,7.0,,,1.0,15
8,carolinafuentes@originaltelecom.es,7.0,3.0,,,1.0,11
9,cesar.arnaldo@originaltelecom.es,,,7.0,4.0,,11


## Replace NaN with 0

Replace all NaN values in the DataFrames with 0 to facilitate further transformations.

In [9]:
def replace_nan_with_zero(df):
    """
    Replace all NaN values in the DataFrame with 0.
    
    Parameters:
    df (pd.DataFrame): The DataFrame to process.
    
    Returns:
    pd.DataFrame: The processed DataFrame with NaN replaced by 0.
    """
    df = df.fillna(0)
    print("Replaced NaN with 0.")
    return df

## Apply Transformation

Apply the transformation to replace NaN values with 0 in each DataFrame.

In [10]:
# Apply the transformation
active_df = replace_nan_with_zero(active_df)
canceled_df = replace_nan_with_zero(canceled_df)
pending_signature_df = replace_nan_with_zero(pending_signature_df)
net_df = replace_nan_with_zero(net_df)

# Display the transformed DataFrames
print("Active DataFrame after replacing NaN:")
display(active_df.head(50))

print("Canceled DataFrame after replacing NaN:")
display(canceled_df.head(50))

print("Pending Signature DataFrame after replacing NaN:")
display(pending_signature_df.head(50))

print("Net DataFrame after replacing NaN:")
display(net_df.head(50))

Replaced NaN with 0.
Replaced NaN with 0.
Replaced NaN with 0.
Replaced NaN with 0.
Active DataFrame after replacing NaN:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,382.0,356.0,217.0,141.0,48.0,1144
1,albaaraujo@originaltelecom.es,11.0,8.0,0.0,0.0,0.0,19
2,albertocanto@originaltelecom.es,9.0,9.0,0.0,0.0,0.0,18
3,albertosanchez@originaltelecom.es,17.0,11.0,0.0,0.0,0.0,28
4,anasanchez@originaltelecom.es,0.0,0.0,24.0,19.0,0.0,43
5,antonio.reina@originaltelecom.es,11.0,11.0,0.0,0.0,5.0,27
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,17.0,0.0,17
7,beatriz.gomez@originaltelecom.es,11.0,15.0,0.0,0.0,4.0,30
8,carolinafuentes@originaltelecom.es,8.0,8.0,0.0,0.0,3.0,19
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,20.0,9.0,0.0,29


Canceled DataFrame after replacing NaN:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,79.0,44.0,23.0,19.0,7.0,172
1,albertocanto@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
2,albertosanchez@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
3,anasanchez@originaltelecom.es,0.0,0.0,2.0,0.0,0.0,2
4,antonio.reina@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
5,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,3.0,0.0,3
6,beatriz.gomez@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1
7,carolinafuentes@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
8,cesar.arnaldo@originaltelecom.es,0.0,0.0,4.0,0.0,0.0,4
9,david.molero@originaltelecom.es,3.0,1.0,0.0,0.0,0.0,4


Pending Signature DataFrame after replacing NaN:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,30.0,27.0,6.0,10.0,9.0,82
1,albaaraujo@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
2,albertocanto@originaltelecom.es,4.0,0.0,0.0,0.0,0.0,4
3,antonio.reina@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
4,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,1.0,0.0,1
5,carolinafuentes@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
6,cesar.arnaldo@originaltelecom.es,0.0,0.0,2.0,3.0,0.0,5
7,david.molero@originaltelecom.es,0.0,2.0,0.0,0.0,0.0,2
8,elenaborrero@originaltelecom.es,1.0,2.0,0.0,0.0,0.0,3
9,enrique.miranda@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1


Net DataFrame after replacing NaN:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
0,Inbound Telec.Orig.Sevilla,251.0,152.0,63.0,55.0,21.0,542
1,albaaraujo@originaltelecom.es,7.0,3.0,0.0,0.0,0.0,10
2,albertocanto@originaltelecom.es,6.0,5.0,0.0,0.0,0.0,11
3,albertosanchez@originaltelecom.es,11.0,5.0,0.0,0.0,0.0,16
4,anasanchez@originaltelecom.es,0.0,0.0,7.0,6.0,0.0,13
5,antonio.reina@originaltelecom.es,7.0,5.0,0.0,0.0,2.0,14
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,7.0,0.0,7
7,beatriz.gomez@originaltelecom.es,7.0,7.0,0.0,0.0,1.0,15
8,carolinafuentes@originaltelecom.es,7.0,3.0,0.0,0.0,1.0,11
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,7.0,4.0,0.0,11


## Load Agents List

Load the list of all agents from the "Agents" sheet.

In [11]:
# Load the list of agents
agents_df = pd.read_excel(excel_file_path, sheet_name='Agents', usecols='A')
agents_list = agents_df.iloc[:, 0].tolist()
print("Agents list loaded successfully!")
print(agents_list)

Agents list loaded successfully!
['albaaraujo@originaltelecom.es', 'albertocanto@originaltelecom.es', 'albertosanchez@originaltelecom.es', 'anasanchez@originaltelecom.es', 'antonio.reina@originaltelecom.es', 'azahara.garcia@originaltelecom.es', 'beatriz.gomez@originaltelecom.es', 'carmen.cornejo@originaltelecom.es', 'carolinafuentes@originaltelecom.es', 'cesar.arnaldo@originaltelecom.es', 'david.molero@originaltelecom.es', 'dolores.cortes@originaltelecom.es', 'elenaborrero@originaltelecom.es', 'estefania.panea@originaltelecom.es', 'formacion10@originaltelecom.es', 'formacion3@originaltelecom.es', 'formacion4@originaltelecom.es', 'francisco.perdomo@originaltelecom.es', 'gonzalofalcon@originaltelecom.es', 'guillermo.hurtado@originaltelecom.es', 'irati.izaguirre@originaltelecom.es', 'ivan.barroso@originaltelecom.es', 'lailasetati@originaltelecom.es', 'laura.eguens@originaltelecom.es', 'leonor.lopez@originaltelecom.es', 'manuelvaldes@originaltelecom.es', 'mar.marchena@originaltelecom.es', 

## Verify and Complete Data

Verify that all agents are present in each DataFrame. If an agent is missing, add a row with zeros for that agent.

In [12]:
def ensure_all_agents(df, agents_list):
    """
    Ensure all agents are present in the DataFrame. Add missing agents with zero values and remove agents not in the list.
    
    Parameters:
    df (pd.DataFrame): The DataFrame to check and update.
    agents_list (list): The list of all agents.
    
    Returns:
    pd.DataFrame: The updated DataFrame with all agents.
    """
    # Get the list of agents in the DataFrame
    existing_agents = df.iloc[:, 0].tolist()
    
    # Find missing agents
    missing_agents = [agent for agent in agents_list if agent not in existing_agents]
    
    # Add rows for missing agents with zero values
    for agent in missing_agents:
        zero_row = pd.DataFrame([[agent] + [0] * (df.shape[1] - 1)], columns=df.columns)
        df = pd.concat([df, zero_row], ignore_index=True)
    
    # Remove agents not in the agents list
    df = df[df.iloc[:, 0].isin(agents_list)]
    
    print(f"Added {len(missing_agents)} missing agents and removed {len(existing_agents) - len(df)} agents not in the list.")
    return df

# Apply the function to each DataFrame
active_df = ensure_all_agents(active_df, agents_list)
canceled_df = ensure_all_agents(canceled_df, agents_list)
pending_signature_df = ensure_all_agents(pending_signature_df, agents_list)
net_df = ensure_all_agents(net_df, agents_list)

# Display the updated DataFrames
print("Active DataFrame after ensuring all agents:")
display(active_df.head(50))

print("Canceled DataFrame after ensuring all agents:")
display(canceled_df.head(50))

print("Pending Signature DataFrame after ensuring all agents:")
display(pending_signature_df.head(50))

print("Net DataFrame after ensuring all agents:")
display(net_df.head(50))

Added 2 missing agents and removed 13 agents not in the list.
Added 10 missing agents and removed 1 agents not in the list.
Added 14 missing agents and removed -9 agents not in the list.
Added 2 missing agents and removed 11 agents not in the list.
Active DataFrame after ensuring all agents:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albaaraujo@originaltelecom.es,11.0,8.0,0.0,0.0,0.0,19
2,albertocanto@originaltelecom.es,9.0,9.0,0.0,0.0,0.0,18
3,albertosanchez@originaltelecom.es,17.0,11.0,0.0,0.0,0.0,28
4,anasanchez@originaltelecom.es,0.0,0.0,24.0,19.0,0.0,43
5,antonio.reina@originaltelecom.es,11.0,11.0,0.0,0.0,5.0,27
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,17.0,0.0,17
7,beatriz.gomez@originaltelecom.es,11.0,15.0,0.0,0.0,4.0,30
8,carolinafuentes@originaltelecom.es,8.0,8.0,0.0,0.0,3.0,19
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,20.0,9.0,0.0,29
11,david.molero@originaltelecom.es,13.0,8.0,0.0,0.0,0.0,21


Canceled DataFrame after ensuring all agents:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albertocanto@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
2,albertosanchez@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
3,anasanchez@originaltelecom.es,0.0,0.0,2.0,0.0,0.0,2
4,antonio.reina@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
5,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,3.0,0.0,3
6,beatriz.gomez@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1
7,carolinafuentes@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
8,cesar.arnaldo@originaltelecom.es,0.0,0.0,4.0,0.0,0.0,4
9,david.molero@originaltelecom.es,3.0,1.0,0.0,0.0,0.0,4
11,dolores.cortes@originaltelecom.es,5.0,0.0,0.0,0.0,0.0,5


Pending Signature DataFrame after ensuring all agents:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albaaraujo@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
2,albertocanto@originaltelecom.es,4.0,0.0,0.0,0.0,0.0,4
3,antonio.reina@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
4,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,1.0,0.0,1
5,carolinafuentes@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
6,cesar.arnaldo@originaltelecom.es,0.0,0.0,2.0,3.0,0.0,5
7,david.molero@originaltelecom.es,0.0,2.0,0.0,0.0,0.0,2
8,elenaborrero@originaltelecom.es,1.0,2.0,0.0,0.0,0.0,3
10,formacion10@originaltelecom.es,0.0,1.0,0.0,0.0,1.0,2
11,formacion4@originaltelecom.es,2.0,0.0,1.0,0.0,1.0,4


Net DataFrame after ensuring all agents:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albaaraujo@originaltelecom.es,7.0,3.0,0.0,0.0,0.0,10
2,albertocanto@originaltelecom.es,6.0,5.0,0.0,0.0,0.0,11
3,albertosanchez@originaltelecom.es,11.0,5.0,0.0,0.0,0.0,16
4,anasanchez@originaltelecom.es,0.0,0.0,7.0,6.0,0.0,13
5,antonio.reina@originaltelecom.es,7.0,5.0,0.0,0.0,2.0,14
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,7.0,0.0,7
7,beatriz.gomez@originaltelecom.es,7.0,7.0,0.0,0.0,1.0,15
8,carolinafuentes@originaltelecom.es,7.0,3.0,0.0,0.0,1.0,11
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,7.0,4.0,0.0,11
11,david.molero@originaltelecom.es,7.0,4.0,0.0,0.0,0.0,11


## Assign Values to Emails

Assign numerical values to each email and add them as a new column in the DataFrames.

In [13]:
# Dictionary mapping emails to their respective values
email_values = {
    'albaaraujo@originaltelecom.es': 1,
    'albertocanto@originaltelecom.es': 2,
    'albertosanchez@originaltelecom.es': 3,
    'anasanchez@originaltelecom.es': 4,
    'antonio.reina@originaltelecom.es': 5,
    'azahara.garcia@originaltelecom.es': 6,
    'beatriz.gomez@originaltelecom.es': 7,
    'carmen.cornejo@originaltelecom.es': 8,
    'carolinafuentes@originaltelecom.es': 9,
    'cesar.arnaldo@originaltelecom.es': 10,
    'david.molero@originaltelecom.es': 11,
    'dolores.cortes@originaltelecom.es': 22,
    'elenaborrero@originaltelecom.es': 12,
    'estefania.panea@originaltelecom.es': 13,
    'formacion10@originaltelecom.es': 39,
    'formacion3@originaltelecom.es': 24,
    'mar.aguila@originaltelecom.es': 25,
    'formacion4@originaltelecom.es': 30,
    'francisco.perdomo@originaltelecom.es': 14,
    'gonzalofalcon@originaltelecom.es': 15,
    'guillermo.hurtado@originaltelecom.es': 16,
    'irati.izaguirre@originaltelecom.es': 17,
    'ivan.barroso@originaltelecom.es': 18,
    'lailasetati@originaltelecom.es': 20,
    'laura.eguens@originaltelecom.es': 19,
    'leonor.lopez@originaltelecom.es': 21,
    'manuelvaldes@originaltelecom.es': 23,
    'mar.marchena@originaltelecom.es': 33,
    'maria.torres@originaltelecom.es': 28,
    'mariaarroyo@originaltelecom.es': 27,
    'mariangeles.bueso@originaltelecom.es': 26,
    'marta.dorado@originaltelecom.es': 29,
    'miguel.segura@originaltelecom.es': 31,
    'miriam.rodriguez@originaltelecom.es': 32,
    'natividad.sanchez@originaltelecom.es': 34,
    'nereacerezo@originaltelecom.es': 35,
    'oscar.rivilla@originaltelecom.es': 36,
    'patricia.rios@originaltelecom.es': 37,
    'paulavilla@originaltelecom.es': 38,
    'sara.elkhelyfy@originaltelecom.es': 40,
    'sergio.vazquez@originaltelecom.es': 41,
    'yicel.patricia@originaltelecom.es': 42,
    'yzabelly.gomes@originaltelecom.es': 43
}

# Add a new column to each DataFrame with the email values
def add_email_values(df, email_values):
    df['email_value'] = df.iloc[:, 0].map(email_values)
    return df

# Apply the function to each DataFrame
active_df = add_email_values(active_df, email_values)
canceled_df = add_email_values(canceled_df, email_values)
pending_signature_df = add_email_values(pending_signature_df, email_values)

# Display the updated DataFrames with the new 'email_value' column
print("Active DataFrame with email values:")
display(active_df.head(65))

print("Canceled DataFrame with email values:")
display(canceled_df.head(65))

print("Pending Signature DataFrame with email values:")
display(pending_signature_df.head(65))

Active DataFrame with email values:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
1,albaaraujo@originaltelecom.es,11.0,8.0,0.0,0.0,0.0,19,1
2,albertocanto@originaltelecom.es,9.0,9.0,0.0,0.0,0.0,18,2
3,albertosanchez@originaltelecom.es,17.0,11.0,0.0,0.0,0.0,28,3
4,anasanchez@originaltelecom.es,0.0,0.0,24.0,19.0,0.0,43,4
5,antonio.reina@originaltelecom.es,11.0,11.0,0.0,0.0,5.0,27,5
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,17.0,0.0,17,6
7,beatriz.gomez@originaltelecom.es,11.0,15.0,0.0,0.0,4.0,30,7
8,carolinafuentes@originaltelecom.es,8.0,8.0,0.0,0.0,3.0,19,9
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,20.0,9.0,0.0,29,10
11,david.molero@originaltelecom.es,13.0,8.0,0.0,0.0,0.0,21,11


Canceled DataFrame with email values:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
1,albertocanto@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5,2
2,albertosanchez@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2,3
3,anasanchez@originaltelecom.es,0.0,0.0,2.0,0.0,0.0,2,4
4,antonio.reina@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5,5
5,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,3.0,0.0,3,6
6,beatriz.gomez@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1,7
7,carolinafuentes@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,9
8,cesar.arnaldo@originaltelecom.es,0.0,0.0,4.0,0.0,0.0,4,10
9,david.molero@originaltelecom.es,3.0,1.0,0.0,0.0,0.0,4,11
11,dolores.cortes@originaltelecom.es,5.0,0.0,0.0,0.0,0.0,5,22


Pending Signature DataFrame with email values:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
1,albaaraujo@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,1
2,albertocanto@originaltelecom.es,4.0,0.0,0.0,0.0,0.0,4,2
3,antonio.reina@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,5
4,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,1.0,0.0,1,6
5,carolinafuentes@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2,9
6,cesar.arnaldo@originaltelecom.es,0.0,0.0,2.0,3.0,0.0,5,10
7,david.molero@originaltelecom.es,0.0,2.0,0.0,0.0,0.0,2,11
8,elenaborrero@originaltelecom.es,1.0,2.0,0.0,0.0,0.0,3,12
10,formacion10@originaltelecom.es,0.0,1.0,0.0,0.0,1.0,2,39
11,formacion4@originaltelecom.es,2.0,0.0,1.0,0.0,1.0,4,30


## Sort DataFrames by Email Values

Sort the DataFrames based on the numerical values assigned to the emails.

In [14]:
# Sort each DataFrame by the 'email_value' column
def sort_by_email_value(df):
    df = df.sort_values(by='email_value')
    return df

# Apply the sorting function to each DataFrame
active_df = sort_by_email_value(active_df)
canceled_df = sort_by_email_value(canceled_df)
pending_signature_df = sort_by_email_value(pending_signature_df)

# Display the sorted DataFrames
print("Sorted Active DataFrame:")
display(active_df.head(50))  # Displaying first 20 rows for testing

print("Sorted Canceled DataFrame:")
display(canceled_df.head(50))  # Displaying first 20 rows for testing

print("Sorted Pending Signature DataFrame:")
display(pending_signature_df.head(50))  # Displaying first 20 rows for testing

Sorted Active DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
1,albaaraujo@originaltelecom.es,11.0,8.0,0.0,0.0,0.0,19,1
2,albertocanto@originaltelecom.es,9.0,9.0,0.0,0.0,0.0,18,2
3,albertosanchez@originaltelecom.es,17.0,11.0,0.0,0.0,0.0,28,3
4,anasanchez@originaltelecom.es,0.0,0.0,24.0,19.0,0.0,43,4
5,antonio.reina@originaltelecom.es,11.0,11.0,0.0,0.0,5.0,27,5
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,17.0,0.0,17,6
7,beatriz.gomez@originaltelecom.es,11.0,15.0,0.0,0.0,4.0,30,7
56,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,8
8,carolinafuentes@originaltelecom.es,8.0,8.0,0.0,0.0,3.0,19,9
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,20.0,9.0,0.0,29,10


Sorted Canceled DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
44,albaaraujo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,1
1,albertocanto@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5,2
2,albertosanchez@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2,3
3,anasanchez@originaltelecom.es,0.0,0.0,2.0,0.0,0.0,2,4
4,antonio.reina@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5,5
5,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,3.0,0.0,3,6
6,beatriz.gomez@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1,7
45,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,8
7,carolinafuentes@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,9
8,cesar.arnaldo@originaltelecom.es,0.0,0.0,4.0,0.0,0.0,4,10


Sorted Pending Signature DataFrame:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general,email_value
1,albaaraujo@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,1
2,albertocanto@originaltelecom.es,4.0,0.0,0.0,0.0,0.0,4,2
34,albertosanchez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,3
35,anasanchez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,4
3,antonio.reina@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1,5
4,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,1.0,0.0,1,6
36,beatriz.gomez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,7
37,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0,8
5,carolinafuentes@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2,9
6,cesar.arnaldo@originaltelecom.es,0.0,0.0,2.0,3.0,0.0,5,10


## Remove 'email_value' Column

After sorting the DataFrames based on the email values, the 'email_value' column should be removed to prevent interference with further calculations.

In [15]:
# Function to remove the 'email_value' column
def remove_email_value_column(df):
    if 'email_value' in df.columns:
        df = df.drop(columns=['email_value'])
    return df

# Apply the function to each DataFrame
active_df = remove_email_value_column(active_df)
canceled_df = remove_email_value_column(canceled_df)
pending_signature_df = remove_email_value_column(pending_signature_df)

# Display the updated DataFrames without the 'email_value' column
print("Active DataFrame after removing 'email_value' column:")
display(active_df.head(65))

print("Canceled DataFrame after removing 'email_value' column:")
display(canceled_df.head(65))

print("Pending Signature DataFrame after removing 'email_value' column:")
display(pending_signature_df.head(65))

Active DataFrame after removing 'email_value' column:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albaaraujo@originaltelecom.es,11.0,8.0,0.0,0.0,0.0,19
2,albertocanto@originaltelecom.es,9.0,9.0,0.0,0.0,0.0,18
3,albertosanchez@originaltelecom.es,17.0,11.0,0.0,0.0,0.0,28
4,anasanchez@originaltelecom.es,0.0,0.0,24.0,19.0,0.0,43
5,antonio.reina@originaltelecom.es,11.0,11.0,0.0,0.0,5.0,27
6,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,17.0,0.0,17
7,beatriz.gomez@originaltelecom.es,11.0,15.0,0.0,0.0,4.0,30
56,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
8,carolinafuentes@originaltelecom.es,8.0,8.0,0.0,0.0,3.0,19
9,cesar.arnaldo@originaltelecom.es,0.0,0.0,20.0,9.0,0.0,29


Canceled DataFrame after removing 'email_value' column:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
44,albaaraujo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
1,albertocanto@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
2,albertosanchez@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
3,anasanchez@originaltelecom.es,0.0,0.0,2.0,0.0,0.0,2
4,antonio.reina@originaltelecom.es,0.0,5.0,0.0,0.0,0.0,5
5,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,3.0,0.0,3
6,beatriz.gomez@originaltelecom.es,1.0,0.0,0.0,0.0,0.0,1
45,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
7,carolinafuentes@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
8,cesar.arnaldo@originaltelecom.es,0.0,0.0,4.0,0.0,0.0,4


Pending Signature DataFrame after removing 'email_value' column:


Unnamed: 0,Etiquetas de fila,2024-08-01 00:00:00,2024-08-02 00:00:00,2024-08-03 00:00:00,2024-08-04 00:00:00,2024-08-05 00:00:00,Total general
1,albaaraujo@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
2,albertocanto@originaltelecom.es,4.0,0.0,0.0,0.0,0.0,4
34,albertosanchez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
35,anasanchez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
3,antonio.reina@originaltelecom.es,0.0,1.0,0.0,0.0,0.0,1
4,azahara.garcia@originaltelecom.es,0.0,0.0,0.0,1.0,0.0,1
36,beatriz.gomez@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
37,carmen.cornejo@originaltelecom.es,0.0,0.0,0.0,0.0,0.0,0
5,carolinafuentes@originaltelecom.es,2.0,0.0,0.0,0.0,0.0,2
6,cesar.arnaldo@originaltelecom.es,0.0,0.0,2.0,3.0,0.0,5


## Process Active DataFrame

Sum all numeric values in each row, divide the result by 2, and store the final values.

In [16]:
# Calculate sum of numeric values in each row, divided by 2
active_sums = active_df.iloc[:, 1:].sum(axis=1) / 2  # Assuming the first column is not numeric
print("Calculated sums for 'Active' DataFrame:")
print(active_sums.head(25))

Calculated sums for 'Active' DataFrame:
1     19.0
2     18.0
3     28.0
4     43.0
5     27.0
6     17.0
7     30.0
56     0.0
8     19.0
9     29.0
11    21.0
15    24.0
17    27.0
23    42.0
24    31.0
25    60.0
26    27.0
27     3.0
30    24.0
29    48.0
31    25.0
14    15.0
32    20.0
19    32.0
57     0.0
dtype: float64


## Update finalFile Excel Sheet

Update the "finalFile" Excel sheet with the calculated values from the Active DataFrame.

In [17]:
from openpyxl import load_workbook

# Path to the final file
final_file_path = '/workspaces/Finetwork-Automation/inbound/finalFile.xlsx'
sheet_name = 'GLOBAL AGOSTO'  # Change this to the correct sheet name

def update_final_file(file_path, sheet_name, values):
    # Load the workbook and select the sheet
    workbook = load_workbook(filename=file_path)
    sheet = workbook[sheet_name]
    
    # Start updating from row 3 in column F (6th column)
    start_row = 3
    column = 6  # Column F
    
    for i, value in enumerate(values, start=start_row):
        sheet.cell(row=i, column=column, value=value)
    
    # Save the workbook
    workbook.save(file_path)
    print(f"Updated {len(values)} rows in '{sheet_name}' sheet of '{file_path}'.")

# Update the final file with the calculated sums
update_final_file(final_file_path, sheet_name, active_sums)

Updated 43 rows in 'GLOBAL AGOSTO' sheet of '/workspaces/Finetwork-Automation/inbound/finalFile.xlsx'.


## Process Canceled DataFrame

Sum all numeric values in each row, divide the result by 2, and store the final values in column N starting from row 3.

In [18]:
# Calculate sum of numeric values in each row, divided by 2 for 'Canceled' DataFrame
canceled_sums = canceled_df.iloc[:, 1:].sum(axis=1) / 2  # Assuming the first column is not numeric
print("Calculated sums for 'Canceled' DataFrame:")
print(canceled_sums.head())

Calculated sums for 'Canceled' DataFrame:
44    0.0
1     5.0
2     2.0
3     2.0
4     5.0
dtype: float64


## Update finalFile Excel Sheet with Canceled Data

Update the "finalFile" Excel sheet with the calculated values from the Canceled DataFrame in column N.

In [19]:
def update_final_file_canceled(file_path, sheet_name, values):
    # Load the workbook and select the sheet
    workbook = load_workbook(filename=file_path)
    sheet = workbook[sheet_name]
    
    # Start updating from row 3 in column N (14th column)
    start_row = 3
    column = 14  # Column N
    
    for i, value in enumerate(values, start=start_row):
        sheet.cell(row=i, column=column, value=value)
    
    # Save the workbook
    workbook.save(file_path)
    print(f"Updated {len(values)} rows in '{sheet_name}' sheet of '{file_path}' with Canceled data.")

# Update the final file with the calculated sums for Canceled
update_final_file_canceled(final_file_path, sheet_name, canceled_sums)

Updated 43 rows in 'GLOBAL AGOSTO' sheet of '/workspaces/Finetwork-Automation/inbound/finalFile.xlsx' with Canceled data.


## Process Pending Signature DataFrame

Sum all numeric values in each row, divide the result by 2, and store the final values in column Q starting from row 3.

In [20]:
# Calculate sum of numeric values in each row, divided by 2 for 'Pending Signature' DataFrame
pending_signature_sums = pending_signature_df.iloc[:, 1:].sum(axis=1) / 2  # Assuming the first column is not numeric
print("Calculated sums for 'Pending Signature' DataFrame:")
print(pending_signature_sums.head())

Calculated sums for 'Pending Signature' DataFrame:
1     1.0
2     4.0
34    0.0
35    0.0
3     1.0
dtype: float64


## Update finalFile Excel Sheet with Pending Signature Data

Update the "finalFile" Excel sheet with the calculated values from the Pending Signature DataFrame in column Q.

In [21]:
def update_final_file_pending_signature(file_path, sheet_name, values):
    # Load the workbook and select the sheet
    workbook = load_workbook(filename=file_path)
    sheet = workbook[sheet_name]
    
    # Start updating from row 3 in column Q (17th column)
    start_row = 3
    column = 17  # Column Q
    
    for i, value in enumerate(values, start=start_row):
        sheet.cell(row=i, column=column, value=value)
    
    # Save the workbook
    workbook.save(file_path)
    print(f"Updated {len(values)} rows in '{sheet_name}' sheet of '{file_path}' with Pending Signature data.")

# Update the final file with the calculated sums for Pending Signature
update_final_file_pending_signature(final_file_path, sheet_name, pending_signature_sums)

Updated 43 rows in 'GLOBAL AGOSTO' sheet of '/workspaces/Finetwork-Automation/inbound/finalFile.xlsx' with Pending Signature data.


## Define Useful Columns for Active DataFrame

Determine the range of columns that will be used for the Active DataFrame.

In [25]:
# Determine the range of columns for Active DataFrame
active_usecols = active_df.columns[:-1]  # Exclude the last column
print("Active DataFrame Useful Columns:")
print(active_usecols)

Active DataFrame Useful Columns:
Index(['Etiquetas de fila', 2024-08-01 00:00:00, 2024-08-02 00:00:00,
       2024-08-03 00:00:00, 2024-08-04 00:00:00, 2024-08-05 00:00:00],
      dtype='object')


## Define Useful Columns for Canceled DataFrame

Determine the range of columns that will be used for the Canceled DataFrame.

In [26]:
# Determine the range of columns for Canceled DataFrame
canceled_usecols = canceled_df.columns[:-1]  # Exclude the last column
print("Canceled DataFrame Useful Columns:")
print(canceled_usecols)

Canceled DataFrame Useful Columns:
Index(['Etiquetas de fila', 2024-08-01 00:00:00, 2024-08-02 00:00:00,
       2024-08-03 00:00:00, 2024-08-04 00:00:00, 2024-08-05 00:00:00],
      dtype='object')


## Define Useful Columns for Pending Signature DataFrame

Determine the range of columns that will be used for the Pending Signature DataFrame.

In [27]:
# Determine the range of columns for Pending Signature DataFrame
pending_signature_usecols = pending_signature_df.columns[:-1]  # Exclude the last column
print("Pending Signature DataFrame Useful Columns:")
print(pending_signature_usecols)

Pending Signature DataFrame Useful Columns:
Index(['Etiquetas de fila', 2024-08-01 00:00:00, 2024-08-02 00:00:00,
       2024-08-03 00:00:00, 2024-08-04 00:00:00, 2024-08-05 00:00:00],
      dtype='object')


## Update Diario Agosto Sheet with Active DataFrame

For each useful column in the Active DataFrame, update the corresponding column in the "DIARIO AGOSTO" sheet.

In [29]:
def update_diario_agosto_with_active(df, file_path):
    """
    Update the "DIARIO AGOSTO" sheet with values from the Active DataFrame.
    
    Parameters:
    df (pd.DataFrame): The Active DataFrame.
    file_path (str): Path to the Excel file.
    """
    workbook = load_workbook(filename=file_path)
    sheet = workbook['DIARIO AGOSTO']
    
    # Define the column mappings: Active DataFrame -> DIARIO AGOSTO
    column_mappings = {
        df.columns[1]: 'C',  # Assuming 'B' is the second column
        df.columns[2]: 'H',  # Assuming 'C' is the third column
        df.columns[3]: 'M',  # Assuming 'D' is the fourth column
        df.columns[4]: 'R',  # Assuming 'E' is the fifth column
        df.columns[5]: 'AB'  # Assuming 'F' is the sixth column
    }
    
    for source_col, target_col in column_mappings.items():
        for row_idx, value in enumerate(df[source_col], start=4):
            sheet[f'{target_col}{row_idx}'] = value
    
    workbook.save(file_path)
    print("Updated DIARIO AGOSTO sheet with Active data.")

# Update the Diario Agosto sheet with the Active DataFrame
update_diario_agosto_with_active(active_df, final_file_path)


Updated DIARIO AGOSTO sheet with Active data.
