# 📝 Rename Columns in the Predictions Dataset

This notebook loads the file `final_historical_with_xgb_predictions.csv`, renames its columns so that they are compatible with Looker Studio, and saves the updated file.


In [3]:
# Import the required libraries
import pandas as pd

# Load the CSV file from the new path
file_path = r"C:\Users\DYLAN\Desktop\REPOS\MachineLearning\lead_scoring_TFG\prediccion_de_datos_sinteticos_ML\final_historical_with_xgb_predictions.csv"
df = pd.read_csv(file_path)

# Show the first rows to understand the current structure
df.head()


Unnamed: 0,Lead Number,Do Not Email,Do Not Call,Total Time Spent on Website,Page Views Per Visit,How did you hear about X Education,What matters most to you in choosing a course,Search,Magazine,Newspaper Article,...,City_Other Metro Cities,City_Select,City_Thane & Outskirts,City_Tier II Cities,Last Notable Activity_Email Opened,Last Notable Activity_Modified,Last Notable Activity_Other Activity,Last Notable Activity_SMS Sent,xgb_predictions,xgb_probabilities
0,738306,morrowmichael@example.net,False,3,0.13,Select,Better Career Prospects,False,False,False,...,0,0,0,0,1,0,0,0,0,0.382035
1,402310,longjessica@example.com,False,2,0.4,Student of SomeSchool,,False,False,False,...,0,0,0,0,0,1,0,0,0,0.291451
2,737279,xlang@example.org,False,3,0.02,Select,,False,False,False,...,0,1,0,0,0,1,0,0,0,0.381659
3,115873,patricia58@example.net,False,2,0.12,Select,,False,False,False,...,0,0,0,0,0,1,0,0,0,0.251681
4,377118,vgreen@example.com,False,3,0.1,Select,Better Career Prospects,False,False,False,...,0,1,0,0,0,1,0,0,0,0.190847


## Rename Columns

We replace the spaces in the column names with underscores (`_`) so that they are compatible with Looker Studio.


In [4]:
# Rename the columns automatically
df.columns = df.columns.str.replace(' ', '_')

# Check the new column names
print(df.columns.tolist())


['Lead_Number', 'Do_Not_Email', 'Do_Not_Call', 'Total_Time_Spent_on_Website', 'Page_Views_Per_Visit', 'How_did_you_hear_about_X_Education', 'What_matters_most_to_you_in_choosing_a_course', 'Search', 'Magazine', 'Newspaper_Article', 'X_Education_Forums', 'Newspaper', 'Digital_Advertisement', 'Through_Recommendations', 'Receive_More_Updates_About_Our_Courses', 'Update_me_on_Supply_Chain_Content', 'Get_updates_on_DM_Content', 'Lead_Profile', 'I_agree_to_pay_the_amount_through_cheque', 'A_free_copy_of_Mastering_The_Interview', 'Date', 'TotalVisits', 'Average_Time_Per_Visit', 'rn', 'Lead_Origin_Landing_Page_Submission', 'Lead_Origin_Lead_Add_Form', 'Lead_Origin_Lead_Import', 'Lead_Origin_Quick_Add_Form', 'Lead_Source_Google', 'Lead_Source_Olark_Chat', 'Lead_Source_Organic_Search', 'Lead_Source_Other_Source', 'Last_Activity_Email_Opened', 'Last_Activity_Olark_Chat_Conversation', 'Last_Activity_Other_Last_Activity', 'Last_Activity_Page_Visited_on_Website', 'Last_Activity_SMS_Sent', 'Specializ
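
Replacing spaces does not remove other punctuation, so some names may still contain characters such as commas. The sketch below lists such names on a toy DataFrame. It assumes that only letters, digits, and underscores are safe; the notebook does not state which characters Looker Studio actually rejects, so treat that allowed set as an assumption. The variable names `toy`, `pattern`, and `suspicious` are illustrative and not part of the original notebook.

```python
import re

import pandas as pd

# Toy frame standing in for `df` after the space replacement.
toy = pd.DataFrame(columns=[
    "Lead_Number",
    "Specialization_Banking,_Investment_And_Insurance",
    "City_Thane_&_Outskirts",
])

# Assumption: anything other than letters, digits, or underscores is suspicious.
pattern = re.compile(r"[^A-Za-z0-9_]")
suspicious = [name for name in toy.columns if pattern.search(name)]
print(suspicious)
```

On the real data, the same list comprehension over `df.columns` would show which names still need manual attention.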

In [9]:
# Rename the problematic column (its name still contains a comma)
df = df.rename(columns={
    'Specialization_Banking,_Investment_And_Insurance': 'Spec_Bank_Inv_Ins'
})

In [10]:
# Check the new column names
print(df.columns.tolist())

['Lead_Number', 'Do_Not_Email', 'Do_Not_Call', 'Total_Time_Spent_on_Website', 'Page_Views_Per_Visit', 'How_did_you_hear_about_X_Education', 'What_matters_most_to_you_in_choosing_a_course', 'Search', 'Magazine', 'Newspaper_Article', 'X_Education_Forums', 'Newspaper', 'Digital_Advertisement', 'Through_Recommendations', 'Receive_More_Updates_About_Our_Courses', 'Update_me_on_Supply_Chain_Content', 'Get_updates_on_DM_Content', 'Lead_Profile', 'I_agree_to_pay_the_amount_through_cheque', 'A_free_copy_of_Mastering_The_Interview', 'Date', 'TotalVisits', 'Average_Time_Per_Visit', 'rn', 'Lead_Origin_Landing_Page_Submission', 'Lead_Origin_Lead_Add_Form', 'Lead_Origin_Lead_Import', 'Lead_Origin_Quick_Add_Form', 'Lead_Source_Google', 'Lead_Source_Olark_Chat', 'Lead_Source_Organic_Search', 'Lead_Source_Other_Source', 'Last_Activity_Email_Opened', 'Last_Activity_Olark_Chat_Conversation', 'Last_Activity_Other_Last_Activity', 'Last_Activity_Page_Visited_on_Website', 'Last_Activity_SMS_Sent', 'Spec_Bank
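
The two renaming steps above could also be folded into a single helper. This is a minimal sketch, not the method used in this notebook: `sanitize_columns` is a hypothetical function, and it assumes that every run of characters other than letters, digits, and underscores should become a single underscore. It therefore produces longer names than the hand-picked `Spec_Bank_Inv_Ins`.

```python
import re

import pandas as pd


def sanitize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column names use only letters, digits, and underscores."""
    new_names = []
    for name in frame.columns:
        # Replace each run of disallowed characters with one underscore.
        cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", str(name))
        # Collapse repeated underscores and trim them from both ends.
        cleaned = re.sub(r"_+", "_", cleaned).strip("_")
        new_names.append(cleaned)
    # Refuse to continue if two different names collapse into the same one.
    if len(set(new_names)) != len(new_names):
        raise ValueError("sanitizing produced duplicate column names")
    result = frame.copy()
    result.columns = new_names
    return result


# Toy frame with names in their pre-rename form.
toy = pd.DataFrame(columns=[
    "Lead Number",
    "Specialization_Banking, Investment And Insurance",
    "City_Thane & Outskirts",
])
clean = sanitize_columns(toy)
print(clean.columns.tolist())
```

The duplicate check matters because aggressive cleaning can map two distinct names to the same string, which would silently create ambiguous columns.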

## Save the Updated CSV

We save the updated file under a new name so that the original stays intact.


In [11]:
# Save the CSV with the new column names
output_path = "C:/Users/DYLAN/Desktop/REPOS/MachineLearning/final_historical_with_xgb_predictions_renamed.csv"
df.to_csv(output_path, index=False)

print(f"File saved successfully to: {output_path}")


File saved successfully to: C:/Users/DYLAN/Desktop/REPOS/MachineLearning/final_historical_with_xgb_predictions_renamed.csv
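
As an optional follow-up, the saved file can be read back to confirm that the header contains the new names and that the original file was not overwritten. The sketch below uses a temporary directory and a two-column toy DataFrame, so the file names `predictions.csv` and `predictions_renamed.csv` are placeholders rather than the notebook's real paths.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for the renamed DataFrame.
toy = pd.DataFrame({
    "Lead_Number": [738306, 402310],
    "xgb_probabilities": [0.382035, 0.291451],
})

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "predictions.csv"          # stands in for the original file
    target = Path(tmp) / "predictions_renamed.csv"  # stands in for the new file
    toy.to_csv(source, index=False)

    # Guard against writing over the original file.
    assert source.resolve() != target.resolve()
    toy.to_csv(target, index=False)

    # nrows=0 parses only the header row, which is enough to check the names.
    saved_columns = pd.read_csv(target, nrows=0).columns.tolist()

print(saved_columns)
```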
