## 🧹 Step 1–2: Load and Normalize SOT Center Names

We start by loading the dataset containing all active Solid Organ Transplant (SOT) centers from 2010 to 2024.

To ensure consistency and reduce duplication, we normalize known naming variants across transplant centers. For example:
- We remove administrative suffixes such as `", convenzione LAZIO"`.
- We merge entries like `"UD - POLICLINICO UNIVERSITARIO"` into `"UD - AZIENDA OSPEDALIERA S. M. MISERICORDIA"` to consolidate data from the same physical center.

This step reduces the number of unique center names and avoids mismatches during integration with the HCT dataset in the next steps.

The following statistics summarize the cleaned dataset (values taken from the Step 3 summary output):
- Total transplants: 50,807
- Active regions: 16
- Unique centers: 41 (down from 46 before normalization)
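
The normalization rules described above can be sketched as a small mapping applied with literal (non-regex) replacement. The helper name `normalize_center_names`, the `NAME_FIXES` dictionary, and the toy `demo` series are illustrative assumptions, not part of the original notebook; the variant names come from the examples above.

```python
import pandas as pd

# Known naming variants -> canonical form (taken from the examples above).
NAME_FIXES = {
    ", convenzione LAZIO": "",
    "UD - POLICLINICO UNIVERSITARIO": "UD - AZIENDA OSPEDALIERA S. M. MISERICORDIA",
}

def normalize_center_names(names: pd.Series, fixes: dict) -> pd.Series:
    """Apply each literal substring replacement in turn (hypothetical helper)."""
    out = names.copy()
    for old, new in fixes.items():
        # regex=False treats '.' and other metacharacters as plain text.
        out = out.str.replace(old, new, regex=False)
    return out

# Toy data for illustration only; "RM - EXAMPLE HOSPITAL" is a made-up name.
demo = pd.Series([
    "RM - EXAMPLE HOSPITAL, convenzione LAZIO",
    "UD - POLICLINICO UNIVERSITARIO",
    "UD - AZIENDA OSPEDALIERA S. M. MISERICORDIA",
])
cleaned = normalize_center_names(demo, NAME_FIXES)
print(cleaned.nunique(), "unique names after normalization")
```

Keeping the rules in one dictionary makes it easy to review which variants are merged and to add new ones later.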

In [1]:
import pandas as pd

# Load datasets
sot_df = pd.read_csv("../data_cleaned/Transplants_Italy_2010_2024_clean.csv")
sot_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6794 entries, 0 to 6793
Data columns (total 8 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   Struttura trapianto  6794 non-null   object
 1   Città                6592 non-null   object
 2   Organo               6794 non-null   object
 3   Sottotipo            6794 non-null   object
 4   Numero               6794 non-null   int64 
 5   Anno                 6794 non-null   int64 
 6   Città_nome           6794 non-null   object
 7   Regione              6794 non-null   object
dtypes: int64(2), object(6)
memory usage: 424.8+ KB


In [2]:
print("Nr of unique 'Strutture trapianto':", sot_df['Struttura trapianto'].nunique())

unique_structures = sot_df['Struttura trapianto'].unique()
count = 0

for structure in unique_structures:
    if 'convenzione' in structure:
        count += 1

print(f"Number of unique 'Struttura trapianto' values containing ', convenzione': {count}")

Nr of unique 'Strutture trapianto': 46
Number of unique 'Struttura trapianto' values containing ', convenzione': 3


In [3]:
# The patterns are literal strings, so regex=False avoids treating '.' as a wildcard
sot_df['Struttura trapianto'] = sot_df['Struttura trapianto'].str.replace(', convenzione LAZIO', '', regex=False)
sot_df['Struttura trapianto'] = sot_df['Struttura trapianto'].str.replace('UD - POLICLINICO UNIVERSITARIO',
                                                                  'UD - AZIENDA OSPEDALIERA S. M. MISERICORDIA',
                                                                  regex=False)
sot_df['Struttura trapianto'] = sot_df['Struttura trapianto'].str.replace('TO - AOU Città della Salute, PO OIRM',
                                                                  'TO - AOU Città della Salute, PO S.G.Battista',
                                                                  regex=False)

### 🔍 Step 3: Summarize Coverage
Once normalized, we print key summary statistics:

- Total number of transplants performed.

- Number of active regions.

- Number of unique transplant centers.

- Distribution of centers across regions.

These statistics help verify data completeness and identify any anomalies before merging with HCT data.

In [4]:
# Preview SOT centers
print("🧬 SOT Italy Active Centers (2010-2024):")
print('Total patients transplanted:', sot_df['Numero'].sum())

regions = [x for x in sot_df['Regione'].unique()]
print('Total nr of Regions with Active Centers:', len(regions))

print('Total Active Centers:', sot_df['Struttura trapianto'].nunique())

print("\nNr of Active Centers per Region: \n")
print(sot_df.groupby('Regione', as_index=False)['Struttura trapianto'].nunique()
      .sort_values(by='Struttura trapianto', ascending=False))

🧬 SOT Italy Active Centers (2010-2024):
Total patients transplanted: 50807
Total nr of Regions with Active Centers: 16
Total Active Centers: 41

Nr of Active Centers per Region: 

                  Regione  Struttura trapianto
7               Lombardia                    8
5                   Lazio                    5
2                Campania                    4
15                 Veneto                    4
3          Emilia-Romagna                    3
12                Sicilia                    3
13                Toscana                    3
1                Calabria                    2
9                Piemonte                    2
0                 Abruzzo                    1
4   Friuli Venezia Giulia                    1
6                 Liguria                    1
8                  Marche                    1
10                 Puglia                    1
11               Sardegna                    1
14                 Umbria                    1


### 🔄 Step 4: Extract and Export HCT Center Names for Manual Harmonization
We extract the distinct list of Hematopoietic Cell Transplant (HCT) center names to prepare for manual name harmonization. This is necessary due to inconsistencies and lack of standardized naming across datasets.

Steps performed:

- Load the pre-processed HCT dataset.

- Select and deduplicate key identifying columns: `region`, `city`, and `full program name`.

- Export the result to an Excel file (`hct_center_names.xlsx`) for manual review and harmonization.

This step ensures that all HCT centers can later be merged accurately with SOT centers based on a unified naming convention.

In [5]:
hct_df = pd.read_excel("../data_cleaned/hct_data_2023_reworked.xlsx")
hct_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 56 entries, 0 to 55
Data columns (total 9 columns):
 #   Column                        Non-Null Count  Dtype 
---  ------                        --------------  ----- 
 0   region                        56 non-null     object
 1   full program name             56 non-null     object
 2   allogeneic total              56 non-null     int64 
 3   matched family donors         56 non-null     int64 
 4   unrelated donors              56 non-null     int64 
 5   haploidentical family donors  56 non-null     int64 
 6   city                          56 non-null     object
 7   patient                       56 non-null     object
 8   Organ                         56 non-null     object
dtypes: int64(4), object(5)
memory usage: 4.1+ KB


In [6]:
# Create a DataFrame for export
hct_names_df = hct_df[['region', 'city', 'full program name']].drop_duplicates()

# Display count
print(f"📌 Total HCT centers: {len(hct_names_df)}")

# Export for manual matching
hct_names_df.to_excel("../data_cleaned/hct_center_names.xlsx", index=False)
print("✅ Exported HCT centers to 'hct_center_names.xlsx'")

📌 Total HCT centers: 56
✅ Exported HCT centers to 'hct_center_names.xlsx'


In [7]:
hct_names_df.head(2)

|   | region | city | full program name |
|---|---|---|---|
| 0 | Abruzzo | Pescara | 248 Ospedale Civile Santo Spirito |
| 1 | Calabria | Reggio Calabria | 587 Grande Ospedale Bianchi-Melacrino-Morelli |


### 📤 Step 5: Export SOT Center Names for Reference and Manual Alignment
In this step, we prepare a clean list of all unique Solid Organ Transplant (SOT) centers as a reference for manual harmonization. This is particularly useful for:

- Aligning naming conventions across datasets.

- Supporting consistent mapping to HCT centers where applicable.

- Building a master list of transplant centers in Italy.

Actions:

- Extract relevant identifying columns: `Regione`, `Città_nome`, and `Struttura trapianto`.

- Remove duplicates to retain only distinct center entries.

- Export the result to `sot_center_names.xlsx` for offline reference or manual editing.

This ensures we maintain naming consistency and simplifies downstream merging processes.



In [8]:
# Extract the full list of SOT center names
sot_names_df = sot_df[['Regione', 'Città_nome', 'Struttura trapianto']].drop_duplicates()
# Export for manual matching
sot_names_df.to_excel("../data_cleaned/sot_center_names.xlsx", index=False)
sot_names_df.head(2)

|   | Regione | Città_nome | Struttura trapianto |
|---|---|---|---|
| 0 | Piemonte | Novara | NO - AOU MAGGIORE DELLA CARITA' - NOVARA |
| 1 | Piemonte | Torino | TO - AOU Città della Salute, PO S.G.Battista |


### 🧩 Harmonizing and Classifying Transplant Centers
In this step, we integrate harmonized names and classifications for all transplant centers—both Solid Organ Transplant (SOT) and Hematopoietic Cell Transplant (HCT)—to create consistent metadata and enable reliable analysis.

#### 🧬 HCT Centers
- We merge the original `hct_df` dataset with the manually curated harmonized names (`hct_center_names_harmonized.xlsx`).

- Each center is assigned a `center_type` based on its activity:

    - `SOT+HCT`: centers performing both transplant types,

    - `HCT only`: HCT-exclusive programs.

We then:

- Rename and standardize column names for compatibility,

- Drop irrelevant columns,

- Add transplant subtype (`allo-HCT`) and transplant year (`2023`).

The goal is for the SOT and HCT datasets to share the following common structure:

```text
region | city | center_name | center_type | patient | organ | subtype | number | year
```
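
One way to confirm that a frame follows this common structure is a small column check. This is a sketch only: `COMMON_COLUMNS` and `check_common_schema` are hypothetical names, and the one-row `demo` frame is assembled from values shown in this notebook's outputs purely for illustration.

```python
import pandas as pd

# Target column order shared by the SOT and HCT frames.
COMMON_COLUMNS = ["region", "city", "center_name", "center_type",
                  "patient", "organ", "subtype", "number", "year"]

def check_common_schema(df: pd.DataFrame) -> list:
    """Return a list of problems; an empty list means the frame conforms."""
    problems = []
    missing = [c for c in COMMON_COLUMNS if c not in df.columns]
    extra = [c for c in df.columns if c not in COMMON_COLUMNS]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    if not missing and not extra and list(df.columns) != COMMON_COLUMNS:
        problems.append("columns are in a different order")
    return problems

# Illustrative single-row frame in the target layout.
demo = pd.DataFrame([{
    "region": "Abruzzo", "city": "Pescara",
    "center_name": "PE - OSPEDALE CIVILE SANTO SPIRITO",
    "center_type": "HCT only", "patient": "adult only",
    "organ": "HCT", "subtype": "allo-HCT", "number": 32, "year": 2023,
}])
print(check_common_schema(demo))
print(check_common_schema(demo.drop(columns="year")))
```

Running the same check on both frames before concatenation would catch a renamed or forgotten column early.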

In [9]:
hct_harmonized = pd.read_excel("../data_cleaned/hct_center_names_harmonized.xlsx")
hct_harmonized.head(2)

|   | region | city | full program name | harmonized name | Type | patient |
|---|---|---|---|---|---|---|
| 0 | Abruzzo | Pescara | 248 Ospedale Civile Santo Spirito | PE - OSPEDALE CIVILE SANTO SPIRITO | HCT only | adult only |
| 1 | Calabria | Reggio Calabria | 587 Grande Ospedale Bianchi-Melacrino-Morelli | RC - OSPEDALE BIANCHI - MELACRINO - MORELLI | SOT+HCT | adult only |


In [10]:
hct_harmonized.groupby('Type')['harmonized name'].nunique()

Type
HCT only    23
SOT+HCT     33
Name: harmonized name, dtype: int64

In [11]:
hct_df.head(2)

|   | region | full program name | allogeneic total | matched family donors | unrelated donors | haploidentical family donors | city | patient | Organ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Abruzzo | 248 Ospedale Civile Santo Spirito | 32 | 8 | 10 | 14 | Pescara | adult | HCT |
| 1 | Calabria | 587 Grande Ospedale Bianchi-Melacrino-Morelli | 41 | 15 | 17 | 9 | Reggio Calabria | adult | HCT |


In [12]:
hct_harmonized.head(2)

|   | region | city | full program name | harmonized name | Type | patient |
|---|---|---|---|---|---|---|
| 0 | Abruzzo | Pescara | 248 Ospedale Civile Santo Spirito | PE - OSPEDALE CIVILE SANTO SPIRITO | HCT only | adult only |
| 1 | Calabria | Reggio Calabria | 587 Grande Ospedale Bianchi-Melacrino-Morelli | RC - OSPEDALE BIANCHI - MELACRINO - MORELLI | SOT+HCT | adult only |


In [13]:
# merging the hct_df with the hct_harmonized
hct_merged = pd.merge(
    hct_df,
    hct_harmonized[['full program name', 'harmonized name','Type','patient']],
    on='full program name',
    how='left'
).rename(columns={'harmonized name': 'center_name', 'Type': 'center_type'})

#hct_merged = hct_merged.drop(columns=['full program name'])

hct_merged.head(2)

|   | region | full program name | allogeneic total | matched family donors | unrelated donors | haploidentical family donors | city | patient_x | Organ | center_name | center_type | patient_y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Abruzzo | 248 Ospedale Civile Santo Spirito | 32 | 8 | 10 | 14 | Pescara | adult | HCT | PE - OSPEDALE CIVILE SANTO SPIRITO | HCT only | adult only |
| 1 | Calabria | 587 Grande Ospedale Bianchi-Melacrino-Morelli | 41 | 15 | 17 | 9 | Reggio Calabria | adult | HCT | RC - OSPEDALE BIANCHI - MELACRINO - MORELLI | SOT+HCT | adult only |


In [14]:
hct_merged.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 56 entries, 0 to 55
Data columns (total 12 columns):
 #   Column                        Non-Null Count  Dtype 
---  ------                        --------------  ----- 
 0   region                        56 non-null     object
 1   full program name             56 non-null     object
 2   allogeneic total              56 non-null     int64 
 3   matched family donors         56 non-null     int64 
 4   unrelated donors              56 non-null     int64 
 5   haploidentical family donors  56 non-null     int64 
 6   city                          56 non-null     object
 7   patient_x                     56 non-null     object
 8   Organ                         56 non-null     object
 9   center_name                   56 non-null     object
 10  center_type                   56 non-null     object
 11  patient_y                     56 non-null     object
dtypes: int64(4), object(8)
memory usage: 5.4+ KB


In [15]:
hct_new_cols = {
    'allogeneic total': 'number',
    'Organ': 'organ',
    'patient_y': 'patient'
    }
hct_merged = hct_merged.rename(columns=hct_new_cols)

hct_merged.drop(labels=['full program name', 'matched family donors','unrelated donors',
                        'haploidentical family donors','patient_x'], axis=1, inplace=True)

hct_merged['subtype'] = 'allo-HCT'
hct_merged['year'] = 2023

hct_merged = hct_merged[['region','city', 'center_name','center_type','patient','organ','subtype','number','year']]
hct_merged.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 56 entries, 0 to 55
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   region       56 non-null     object
 1   city         56 non-null     object
 2   center_name  56 non-null     object
 3   center_type  56 non-null     object
 4   patient      56 non-null     object
 5   organ        56 non-null     object
 6   subtype      56 non-null     object
 7   number       56 non-null     int64 
 8   year         56 non-null     int64 
dtypes: int64(2), object(7)
memory usage: 4.1+ KB


#### 🧠 SOT Centers
- Similarly, we merge `sot_df` with harmonized names from `sot_center_names_harmonized.xlsx`.

- We map and clean the data by:

    - Renaming columns,

    - Standardizing the schema to match the HCT dataset,

    - Adding classification via the `center_type` field (either `SOT only` or `SOT+HCT`),

    - Ensuring all entries share the same structure for downstream merging.

As with the HCT data, the goal is for the SOT dataset to follow the same common structure:

```text
region | city | center_name | center_type | patient | organ | subtype | number | year
```

In [16]:
sot_harmonized = pd.read_excel("../data_cleaned/sot_center_names_harmonized.xlsx")
sot_harmonized.head(2)

|   | Regione | Città_nome | Struttura trapianto | harmonized name | Type | patient |
|---|---|---|---|---|---|---|
| 0 | Piemonte | Novara | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | SOT only | adult only |
| 1 | Piemonte | Torino | TO - AOU Città della Salute, PO S.G.Battista | TO - AOU Città della Salute, PO S.G.Battista | SOT+HCT | adult+pediatric |


In [17]:
sot_harmonized.groupby('Type')['harmonized name'].nunique()

Type
SOT only     8
SOT+HCT     33
Name: harmonized name, dtype: int64

In [18]:
len(sot_harmonized), len(hct_harmonized)

(41, 56)
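
Both harmonized tables report 33 `SOT+HCT` centers, so the two sets of harmonized names should be identical. A possible cross-check is sketched below; `shared_center_mismatches` is a hypothetical helper, and the toy frames (including which side carries the non-breaking space) are assumptions made only to show the kind of discrepancy it surfaces.

```python
import pandas as pd

def shared_center_mismatches(sot: pd.DataFrame, hct: pd.DataFrame):
    """Return SOT+HCT names found in only one of the two harmonized tables."""
    sot_names = set(sot.loc[sot["Type"] == "SOT+HCT", "harmonized name"])
    hct_names = set(hct.loc[hct["Type"] == "SOT+HCT", "harmonized name"])
    return sorted(sot_names - hct_names), sorted(hct_names - sot_names)

# Toy data: the same center, but one spelling hides a non-breaking space (\xa0).
sot_toy = pd.DataFrame({
    "harmonized name": ["TO - AOU Città della Salute, PO S.G.Battista",
                        "NO - AOU MAGGIORE DELLA CARITA' - NOVARA"],
    "Type": ["SOT+HCT", "SOT only"],
})
hct_toy = pd.DataFrame({
    "harmonized name": ["TO - AOU Città\xa0 della Salute, PO S.G.Battista"],
    "Type": ["SOT+HCT"],
})
only_in_sot, only_in_hct = shared_center_mismatches(sot_toy, hct_toy)
# repr() makes invisible characters such as \xa0 visible.
print([repr(n) for n in only_in_sot])
print([repr(n) for n in only_in_hct])
```

Two empty lists would indicate that every shared center is spelled identically in both files.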

In [19]:
sot_df.head(2)

|   | Struttura trapianto | Città | Organo | Sottotipo | Numero | Anno | Città_nome | Regione |
|---|---|---|---|---|---|---|---|---|
| 0 | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | NO | Rene | Rene | 67 | 2010 | Novara | Piemonte |
| 1 | TO - AOU Città della Salute, PO S.G.Battista | TO | Rene | Rene | 5 | 2010 | Torino | Piemonte |


In [20]:
sot_harmonized.head(2)

|   | Regione | Città_nome | Struttura trapianto | harmonized name | Type | patient |
|---|---|---|---|---|---|---|
| 0 | Piemonte | Novara | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | SOT only | adult only |
| 1 | Piemonte | Torino | TO - AOU Città della Salute, PO S.G.Battista | TO - AOU Città della Salute, PO S.G.Battista | SOT+HCT | adult+pediatric |


In [21]:
# merging the sot_df with the sot_harmonized
sot_merged = pd.merge(
    sot_df,
    sot_harmonized[['Struttura trapianto', 'harmonized name','Type','patient']],
    on='Struttura trapianto',
    how='left'
).rename(columns={'harmonized name': 'center_name', 'Type': 'center_type'})

#sot_merged = sot_merged.drop(columns=['Struttura trapianto'])

sot_merged.head(2)

|   | Struttura trapianto | Città | Organo | Sottotipo | Numero | Anno | Città_nome | Regione | center_name | center_type | patient |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | NO | Rene | Rene | 67 | 2010 | Novara | Piemonte | NO - AOU MAGGIORE DELLA CARITA' - NOVARA | SOT only | adult only |
| 1 | TO - AOU Città della Salute, PO S.G.Battista | TO | Rene | Rene | 5 | 2010 | Torino | Piemonte | TO - AOU Città della Salute, PO S.G.Battista | SOT+HCT | adult+pediatric |


In [22]:
sot_merged.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6794 entries, 0 to 6793
Data columns (total 11 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   Struttura trapianto  6794 non-null   object
 1   Città                6592 non-null   object
 2   Organo               6794 non-null   object
 3   Sottotipo            6794 non-null   object
 4   Numero               6794 non-null   int64 
 5   Anno                 6794 non-null   int64 
 6   Città_nome           6794 non-null   object
 7   Regione              6794 non-null   object
 8   center_name          6794 non-null   object
 9   center_type          6794 non-null   object
 10  patient              6794 non-null   object
dtypes: int64(2), object(9)
memory usage: 584.0+ KB


In [23]:
sot_new_cols = {
    'Regione': 'region',
    'Città_nome': 'city',
    'Organo': 'organ',
    'Sottotipo': 'subtype',
    'Numero': 'number',
    'Anno': 'year'
}

sot_merged = sot_merged.rename(columns=sot_new_cols)
sot_merged.drop(labels=['Struttura trapianto', 'Città'], axis=1, inplace=True)
sot_merged = sot_merged[['region','city', 'center_name','center_type','patient','organ','subtype','number','year']]
sot_merged.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6794 entries, 0 to 6793
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   region       6794 non-null   object
 1   city         6794 non-null   object
 2   center_name  6794 non-null   object
 3   center_type  6794 non-null   object
 4   patient      6794 non-null   object
 5   organ        6794 non-null   object
 6   subtype      6794 non-null   object
 7   number       6794 non-null   int64 
 8   year         6794 non-null   int64 
dtypes: int64(2), object(7)
memory usage: 477.8+ KB


### 🧷 Final Integration of Transplant Data
We now concatenate the standardized Solid Organ Transplant (SOT) and Hematopoietic Cell Transplant (HCT) datasets into a single, unified DataFrame:

In [24]:
final_df = pd.concat([sot_merged,hct_merged])
final_df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 6850 entries, 0 to 55
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   region       6850 non-null   object
 1   city         6850 non-null   object
 2   center_name  6850 non-null   object
 3   center_type  6850 non-null   object
 4   patient      6850 non-null   object
 5   organ        6850 non-null   object
 6   subtype      6850 non-null   object
 7   number       6850 non-null   int64 
 8   year         6850 non-null   int64 
dtypes: int64(2), object(7)
memory usage: 535.2+ KB
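
The index line above reads `6850 entries, 0 to 55` because `pd.concat` keeps each input's own row labels, so labels 0 to 55 appear twice. That is harmless for the group-bys and exports used in this notebook, but label-based selection with `.loc` would return two rows. If unique labels are preferred, `ignore_index=True` renumbers the result. A minimal sketch on toy frames (the names and values are made up):

```python
import pandas as pd

# Toy stand-ins for the two standardized frames.
sot_part = pd.DataFrame({"center_name": ["A", "B", "C"], "number": [1, 2, 3]})
hct_part = pd.DataFrame({"center_name": ["A", "D"], "number": [4, 5]})

# Default behaviour: original labels are kept, so 0 and 1 appear twice.
stacked = pd.concat([sot_part, hct_part])
print(stacked.index.is_unique)

# ignore_index=True assigns a fresh 0..n-1 RangeIndex.
renumbered = pd.concat([sot_part, hct_part], ignore_index=True)
print(renumbered.index.is_unique)
```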


### ✅ Data Integrity Checks
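To ensure consistency:

- We verify the number of unique centers by `center_type` (i.e., `SOT only`, `HCT only`, `SOT+HCT`).

- We resolve any encoding artifacts or character inconsistencies, e.g., replacing non-breaking spaces in names. The first check reports 34 `SOT+HCT` centers rather than the 33 listed in each harmonized file, because one Torino center name contains a non-breaking space; once it is replaced, the count returns to 33.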
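
The non-breaking-space fix can be generalized so that it does not depend on knowing the affected name in advance. The sketch below is one possible approach under that assumption; `clean_whitespace` is a hypothetical helper and the two-element series is toy data.

```python
import pandas as pd

def clean_whitespace(names: pd.Series) -> pd.Series:
    """Collapse any run of whitespace into one regular space and trim the ends."""
    # In Python 3 regexes on str, \s also matches the non-breaking space U+00A0.
    return names.str.replace(r"\s+", " ", regex=True).str.strip()

# Toy data: two spellings of one center that differ only by "\xa0 " vs " ".
names = pd.Series([
    "TO - AOU Città\xa0 della Salute, PO S.G.Battista",
    "TO - AOU Città della Salute, PO S.G.Battista",
])
print(names.nunique())                    # before cleaning
print(clean_whitespace(names).nunique())  # after cleaning
```

Applying such a helper to `center_name` before counting unique centers would remove this class of invisible duplicates in one pass.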

In [25]:
# sanity check of the sot_merged and hct_merged concatenation
final_df.groupby(['center_type'],as_index=False)['center_name'].nunique()

|   | center_type | center_name |
|---|---|---|
| 0 | HCT only | 23 |
| 1 | SOT only | 8 |
| 2 | SOT+HCT | 34 |


In [26]:
final_df.groupby(['patient'],as_index=False)['center_name'].nunique()

|   | patient | center_name |
|---|---|---|
| 0 | adult only | 46 |
| 1 | adult+pediatric | 14 |
| 2 | pediatric only | 5 |


In [27]:
# Replace the variant containing a non-breaking space (\xa0) with the canonical name
final_df['center_name'] = final_df['center_name'].str.replace('TO - AOU Città\xa0 della Salute, PO S.G.Battista',
                                                              'TO - AOU Città della Salute, PO S.G.Battista',
                                                              regex=False)

In [28]:
# sanity check of the sot_merged and hct_merged concatenation
final_df.groupby(['center_type'],as_index=False)['center_name'].nunique()

|   | center_type | center_name |
|---|---|---|
| 0 | HCT only | 23 |
| 1 | SOT only | 8 |
| 2 | SOT+HCT | 33 |


In [29]:
final_df.groupby(['patient'],as_index=False)['center_name'].nunique()

|   | patient | center_name |
|---|---|---|
| 0 | adult only | 46 |
| 1 | adult+pediatric | 13 |
| 2 | pediatric only | 5 |


### 📊 Summary Stats
- **Total SOT transplants (2010-2024)**: 50,807, computed as `final_df[final_df['organ'] != 'HCT']['number'].sum()`

- **Total HCT transplants (allo-HCT, 2023 only)**: 1,991, computed as `final_df[final_df['organ'] == 'HCT']['number'].sum()`

`final_df` is now ready for analysis, aggregation, and visualization across centers, organ types, subtypes, and years. Because the HCT figures cover a single year, the two totals are not directly comparable.
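
As a sketch of the kind of aggregation the unified layout allows, the example below splits totals by transplant type and then by organ and year. The `toy` frame and its numbers are made up for illustration and are not the notebook's data.

```python
import pandas as pd

# Toy frame using three columns of the common layout (values are illustrative).
toy = pd.DataFrame({
    "organ":  ["Rene", "Rene", "Fegato", "HCT"],
    "year":   [2010, 2011, 2010, 2023],
    "number": [67, 5, 12, 32],
})

# Boolean mask separating HCT rows from solid-organ rows.
is_hct = toy["organ"] == "HCT"
sot_total = toy.loc[~is_hct, "number"].sum()
hct_total = toy.loc[is_hct, "number"].sum()
print(sot_total, hct_total)

# Transplants per organ and year, one row per combination.
by_organ_year = toy.groupby(["organ", "year"], as_index=False)["number"].sum()
print(by_organ_year)
```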

In [30]:
final_df[final_df['organ'] != 'HCT']['number'].sum()

50807

In [31]:
final_df[final_df['organ'] == 'HCT']['number'].sum()

1991

### 💾 Final Export for Data Visualization
We save the unified and cleaned dataset (`final_df`) in both CSV and Excel formats. The CSV file is optimized for integration with visualization tools such as **Tableau Public**, **Power BI**, or **Google Looker Studio**.

In [32]:
# Export to CSV
final_df.to_csv("../data_final_for_dataviz/transplants_italy_clean.csv", index=False)

# Optional: export to Excel for additional review
final_df.to_excel("../data_final_for_dataviz/transplants_italy_clean.xlsx", index=False)
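
Several center names contain commas, so it can be reassuring to confirm that a CSV export reads back unchanged. The following is a sketch using an in-memory buffer instead of the real output path; the one-row `toy` frame is illustrative only.

```python
import io
import pandas as pd

# Toy row whose center name contains a comma and an accented character.
toy = pd.DataFrame({
    "center_name": ["TO - AOU Città della Salute, PO S.G.Battista"],
    "number": [5],
    "year": [2010],
})

# Write to an in-memory text buffer rather than to disk.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)

# Rewind and read the CSV back; pandas quotes the comma-containing field.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored.loc[0, "center_name"] == toy.loc[0, "center_name"])
```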

In [33]:
organ_map = {
    'Rene': 'Kidney',
    'Fegato': 'Liver',
    'Cuore': 'Heart',
    'Polmone': 'Lung',
    'Pancreas': 'Pancreas',
    'Intestino': 'Intestine',
    'HCT': 'HCT'
}

# 1. Convert the dictionary to a Pandas DataFrame
# We'll create a DataFrame from the dictionary items.
# The keys will become one column, and the values will become another.
# We'll explicitly name these columns.
df = pd.DataFrame(list(organ_map.items()), columns=['Italian_Term', 'English_Organ_Name'])

# Display the DataFrame to see its structure
print("--- DataFrame Content ---")
print(df)
print("\n")

# 2. Save the DataFrame as a CSV file
# You can specify the file path and name.
# index=False prevents Pandas from writing the DataFrame index as a column in the CSV.
csv_file_name = '../data_final_for_dataviz/organ_translations.csv'
df.to_csv(csv_file_name, index=False)

print(f"DataFrame successfully saved to {csv_file_name}")

--- DataFrame Content ---
  Italian_Term English_Organ_Name
0         Rene             Kidney
1       Fegato              Liver
2        Cuore              Heart
3      Polmone               Lung
4     Pancreas           Pancreas
5    Intestino          Intestine
6          HCT                HCT


DataFrame successfully saved to ../data_final_for_dataviz/organ_translations.csv
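
The notebook only exports this translation table. If the translation were instead applied inside pandas, one possible sketch is shown below: it maps the Italian terms and flags any value missing from the dictionary. The `organs` series is toy data, the dictionary is a subset of `organ_map`, and "Milza" is a made-up entry included only to show an unmapped value.

```python
import pandas as pd

# Subset of the organ_map defined above.
organ_map = {"Rene": "Kidney", "Fegato": "Liver", "HCT": "HCT"}

# Toy column; "Milza" is deliberately absent from the mapping.
organs = pd.Series(["Rene", "Fegato", "HCT", "Milza"])

# Series.map returns NaN for keys that are not in the dictionary.
translated = organs.map(organ_map)
unmapped = organs[translated.isna()].unique().tolist()
print(translated.tolist())
print("Unmapped terms:", unmapped)
```

Checking `unmapped` before relying on the translated column avoids silently losing categories that the dictionary does not cover.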


In [34]:
organ_map_extended = {
    'Rene': 'Kidney',
    'Rene doppio': 'Double Kidney',
    'Rene - fegato': 'Kidney - Liver',
    'Rene - pancreas': 'Kidney - Pancreas',
    'Rene - fegato - pancreas': 'Kidney - Liver - Pancreas',
    'Rene - cuore': 'Kidney - Heart',
    'Fegato': 'Liver',
    'Fegato - pancreas': 'Liver - Pancreas',
    'Fegato - pancreas - intestino': 'Liver - Pancreas - Intestine',
    'Cuore - fegato': 'Heart - Liver',
    'Cuore': 'Heart',
    'Cuore - polmone doppio': 'Heart - Double Lung',
    'Polmone': 'Lung',
    'Polmone doppio': 'Double Lung',
    'Pancreas': 'Pancreas',
    'Pancreas - intestino': 'Pancreas - Intestine',
    'Intestino': 'Intestine',
    'Fegato - pancreas - polmone doppio': 'Liver - Pancreas - Double Lung',
    'Fegato - polmone doppio': 'Liver - Double Lung',
    'Rene doppio - fegato': 'Double Kidney - Liver',
    'Rene - polmone': 'Kidney - Lung',
    'Rene - polmone doppio': 'Kidney - Double Lung',
    'Doppio rene - pancreas': 'Double Kidney - Pancreas',
    'allo-HCT': 'Allo-HCT'
}

# 1. Convert the dictionary to a Pandas DataFrame
# We'll create a DataFrame from the dictionary items.
# The keys will become one column, and the values will become another.
# We'll explicitly name these columns.
df = pd.DataFrame(list(organ_map_extended.items()), columns=['Italian_Term', 'English_Organ_Name'])

# Display the DataFrame to see its structure
print("--- DataFrame Content ---")
print(df)
print("\n")

# 2. Save the DataFrame as a CSV file
# You can specify the file path and name.
# index=False prevents Pandas from writing the DataFrame index as a column in the CSV.
csv_file_name = '../data_final_for_dataviz/organ_subtype_translations.csv'
df.to_csv(csv_file_name, index=False)

print(f"DataFrame successfully saved to {csv_file_name}")


--- DataFrame Content ---
                          Italian_Term              English_Organ_Name
0                                 Rene                          Kidney
1                          Rene doppio                   Double Kidney
2                        Rene - fegato                  Kidney - Liver
3                      Rene - pancreas               Kidney - Pancreas
4             Rene - fegato - pancreas       Kidney - Liver - Pancreas
5                         Rene - cuore                  Kidney - Heart
6                               Fegato                           Liver
7                    Fegato - pancreas                Liver - Pancreas
8        Fegato - pancreas - intestino    Liver - Pancreas - Intestine
9                       Cuore - fegato                   Heart - Liver
10                               Cuore                           Heart
11              Cuore - polmone doppio             Heart - Double Lung
12                             Polmone             