# SystemicTreatment to OMOP

![First image](images/SystemicTreatmentIDEA1.png)

OMOP distinguishes two kinds of treatment. Individual treatments follow the standard OMOP treatment model and map to drug exposure and procedure occurrence.

Systemic Treatment in IDEA4RC also includes the information about the regimen and its cycles. In OMOP this information is a composite treatment, which is modelled in the episode table as nested episodes.
![Second image](images/SystemicTreatmentIDEA2.png)


### Doubts

How do we map regimens and cycles?

In the IDEA4RC data model, Systemic Treatment holds the information about regimens and cycles, and DrugsForTreatments holds the information about the drugs. Together they correspond to the composite treatment model of OMOP.

To map this to OMOP, we can assume the following.
Systemic Treatment gives the interval of the regimen, so we divide that interval by the number of cycles to obtain the dates needed to create the nested treatment episodes that represent the cycles. If a drug is associated with the systemic treatment, the default action is to create one drug exposure per cycle episode and to set the drug exposure date to the middle of that cycle. Drug exposures and episodes are linked through the OMOP episode event table.
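The splitting rule described above can be sketched as follows. This is a minimal illustration, not the project's agreed implementation: the function names, the dictionary keys and the `DRUG_EXPOSURE_FIELD_CONCEPT_ID` placeholder are assumptions introduced here, and consecutive cycles are assumed to share their boundary date.

```python
from datetime import date

# Placeholder (assumption): the concept identifying drug_exposure.drug_exposure_id
# still has to be looked up in the vocabulary; 0 is not a valid value.
DRUG_EXPOSURE_FIELD_CONCEPT_ID = 0


def split_regimen_into_cycles(regimen_start, regimen_end, n_cycles):
    """Split the regimen interval into n_cycles equal consecutive cycles.

    Returns one dict per cycle with its number, start date, end date and
    the midpoint date used as the default drug exposure date.
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    if regimen_end < regimen_start:
        raise ValueError("regimen_end must not be before regimen_start")

    span = regimen_end - regimen_start
    # n_cycles + 1 boundaries; fractions of a day are dropped when added to a date
    boundaries = [regimen_start + span * i / n_cycles for i in range(n_cycles + 1)]

    cycles = []
    for number, (c_start, c_end) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        cycles.append({
            "cycle_number": number,
            "start_date": c_start,
            "end_date": c_end,
            "drug_exposure_date": c_start + (c_end - c_start) / 2,
        })
    return cycles


def episode_event_rows(cycle_episode_ids, drug_exposure_ids,
                       field_concept_id=DRUG_EXPOSURE_FIELD_CONCEPT_ID):
    """Pair each cycle episode with its drug exposure as episode_event rows."""
    return [(episode_id, exposure_id, field_concept_id)
            for episode_id, exposure_id in zip(cycle_episode_ids, drug_exposure_ids)]


cycles = split_regimen_into_cycles(date(2024, 1, 1), date(2024, 3, 31), 3)
for cycle in cycles:
    print(cycle)
```

Each returned cycle would become one nested episode whose parent_episode_id points at the regimen episode, and each `drug_exposure_date` would become the date of the drug exposure created for that cycle.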

Systemic treatment is mainly mapped to episode, measurement, procedure occurrence and visit occurrence (the visit occurrence is required, as can be seen in the drugs for treatments mapping), following: https://docs.google.com/spreadsheets/d/1Vw1Dr2K4oG__cDQTutGaJhZvGUvQTLwc4qWreP6qMSs/edit?gid=252770424#gid=252770424

We need visit concept id and visit type concept id

To keep the cycle information in OMOP, for each systemic treatment we take the total duration of the treatment divided by the number of cycles as the duration of each cycle.

In [None]:
import pandas as pd
import psycopg2
import random
import numpy as np 
from datetime import datetime, timedelta
import mysql.connector
from uuid import uuid4


In [None]:
systemic_treatment_IDEA4RC = pd.read_csv("./IDEA4RC-data/systemicTreatmentIDEA4RC.csv")
systemic_treatment_IDEA4RC.head(5)

In [None]:
conn = psycopg2.connect(
    dbname="omopdb",
    user="postgres",
    password="mysecretpassword",
    host="localhost",
    port="5432"
)

cur = conn.cursor()
config = {
    'user': 'user', 
    'password': 'password',
    'host': '127.0.0.1',
    'database': 'idea4rc_dm',
    'raise_on_warnings': True
}

conn2 = mysql.connector.connect(**config)
curIDEA = conn2.cursor()

### Systemic Treatment to Episode

Since it is not clear how existing episodes should be handled, we try to create a new episode and, if it already exists, simply update its data.

Open question: which episode_type_concept_id should be used? The value used below is not correct.
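The "create it, or update it if it already exists" behaviour can be expressed as a single upsert statement. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database as a stand-in for the OMOP PostgreSQL instance, the table is a reduced copy of the episode columns used in this notebook, and `upsert_episode` is a hypothetical helper. PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` form, with `%s` placeholders in psycopg2 and a unique constraint on episode_id.

```python
import sqlite3

# In-memory stand-in for the OMOP database (assumption, for illustration only)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("""
    CREATE TABLE episode (
        episode_id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL,
        episode_start_date TEXT NOT NULL,
        episode_end_date TEXT,
        episode_concept_id INTEGER NOT NULL,
        episode_type_concept_id INTEGER NOT NULL,
        parent_episode_id INTEGER
    )
""")

UPSERT_EPISODE = """
    INSERT INTO episode (episode_id, person_id, episode_start_date, episode_end_date,
                         episode_concept_id, episode_type_concept_id, parent_episode_id)
    VALUES (?, ?, ?, ?, ?, ?, ?)
    ON CONFLICT (episode_id) DO UPDATE SET
        person_id = excluded.person_id,
        episode_start_date = excluded.episode_start_date,
        episode_end_date = excluded.episode_end_date,
        episode_concept_id = excluded.episode_concept_id,
        episode_type_concept_id = excluded.episode_type_concept_id,
        parent_episode_id = excluded.parent_episode_id
"""


def upsert_episode(connection, episode_row):
    """Insert the episode, or overwrite its data if episode_id already exists."""
    connection.execute(UPSERT_EPISODE, episode_row)
    connection.commit()


# First call inserts the episode, second call updates its end date.
# The concept IDs are 0 because the correct values are still an open question.
upsert_episode(demo_conn, (1, 42, "2024-01-01", "2024-03-31", 0, 0, None))
upsert_episode(demo_conn, (1, 42, "2024-01-01", "2024-04-15", 0, 0, None))
print(demo_conn.execute("SELECT episode_id, episode_end_date FROM episode").fetchall())
```

An upsert keeps the decision in one statement, so the notebook does not need a separate SELECT to check whether the episode exists before writing it.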

In [None]:
df_tables=systemic_treatment_IDEA4RC
df_tables['idThisEvent'] = None
sql = """
INSERT INTO EPISODE (episode_id,person_id, episode_start_date, episode_end_date, episode_concept_id,episode_type_concept_id, parent_episode_id)
VALUES (%s,%s,%s,%s,%s,%s,%s)
"""

queryPerson="""
    SELECT e.patient_id FROM cancer_episode e
    INNER JOIN episode_event v ON e.id=v.cancer_episode
    INNER JOIN systemic_treatment st ON st.episode_event=v.id
    WHERE st.id=%s
"""    

queryDate="""
    SELECT episode_start_date
    FROM cancer_episode
    WHERE id = %s
"""
sqlGetEpisode = """
    SELECT c.episode_id
    FROM episode c
    WHERE c.person_id = %s
    AND c.episode_start_date = %s
    LIMIT 1
"""

# NOTE: this second definition overrides the queryPerson defined earlier in this cell
queryPerson= """
    SELECT c.patient 
    FROM cancer_episode c
    WHERE EXISTS (
        SELECT 1 
        FROM EpisodeEvent e
        WHERE e.cancerEpisode = c.id
        AND e.id = %s
    )
    LIMIT 1
"""


    
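for idx, row in df_tables.iterrows():
    # Query parameters must be passed as a tuple, hence the trailing comma
    curIDEA.execute(queryPerson, (row['id'],))
    res=curIDEA.fetchone()
    person_id=res[0]
    curIDEA.execute(queryDate, (row['Episode Event Reference'],))
    res=curIDEA.fetchone()
    episode_start_date=res[0]
    # Look for an existing OMOP episode of this person to use as the parent
    cur.execute(sqlGetEpisode, (person_id,episode_start_date))
    res=cur.fetchone()
    parent_episode_id=res[0] if res else None
    # NOTE: OMOP defines episode_id as an integer; this string ID only works if the column type allows it
    newId=datetime.now().strftime('%Y%m%d%H%M%S')+ str(uuid4())
    # Assigning to `row` would only modify a copy, so write to the DataFrame directly
    df_tables.at[idx,'idThisEvent']=newId
    cur.execute(sql,(newId,person_id, row['startdate'], row['enddate'],row['treatmentResponse'],row['treatmentResponse'],parent_episode_id))
    conn.commit()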


### Systemic Treatment to Procedure Occurrence

In [None]:

df_tables=systemic_treatment_IDEA4RC
queryEVPO="""
    INSERT INTO omopcdm.episode_event (episode_id,event_id,episode_event_field_concept_id)
    VALUES (%s, %s, %s)
"""

sqlProcedure="""
    INSERT INTO omopcdm.procedure_occurrence (procedure_occurrence_id,person_id,procedure_concept_id,procedure_date,procedure_end_date,procedure_type_concept_id)
    VALUES (%s, %s, %s, %s, %s, %s)
"""
query= """
    SELECT c.patient 
    FROM cancer_episode c
    WHERE EXISTS (
        SELECT 1 
        FROM EpisodeEvent e
        WHERE e.cancerEpisode = c.id
        AND e.id = %s
    )
    LIMIT 1

"""
columnsPO = [
    "Type of systemic treatment",
    "Intent",
    "Setting",
    "Chemotherapy info"
]
def toTableEpisodeEventProcedureOcurrence(idEpisode , idProcedureOcurrence):
    cur.execute(queryEVPO,(idEpisode,idProcedureOcurrence,1147082)) #1147082 or 1147810, not clear
    conn.commit()

for idx, row in df_tables.iterrows():
    # Query parameters must be passed as a tuple, hence the trailing comma
    curIDEA.execute(query, (row['Episode Event Reference'],))
    res=curIDEA.fetchone()
    person_id=res[0]
    procedure_date=row['Start date regimen changed']
    procedure_end_date=row['End date regimen changed']
    procedure_type_concept_id=0 #Still don't know what to do with this one
    # Each of these columns becomes its own procedure_occurrence row linked to the episode
    for column in columnsPO:
        procedure_concept_id=row[column]
        newId = datetime.now().strftime('%Y%m%d%H%M%S') + str(uuid4())
        toTableEpisodeEventProcedureOcurrence(row['idThisEvent'],newId)
        cur.execute(sqlProcedure,(newId, person_id, procedure_concept_id,procedure_date,procedure_end_date,procedure_type_concept_id))
        conn.commit()

### Systemic Treatment to Measurement

Open questions: which measurement_concept_id should be used?

Which meas_event_field_concept_id should be used?

In [None]:
sqlMeasurement = """
    INSERT INTO omopcdm.measurement (person_id, measurement_concept_id, measurement_date, measurement_type_concept_id, value_as_number,measurement_event_id,meas_event_field_concept_id)
    VALUES (%s, %s, %s,%s,%s,%s,%s)
    """
measurement_type_concept_id=38000280
measurement_concept_id=0 #Placeholder, the correct concept is still an open question
meas_event_field_concept_id=0 #Placeholder, the correct concept is still an open question
for idx, row in df_tables.iterrows():
    # Query parameters must be passed as a tuple, hence the trailing comma
    curIDEA.execute(query, (row['Episode Event Reference'],))
    res=curIDEA.fetchone()
    person_id=res[0]
    date=row['Start date regimen changed']
    # The number of cycles is stored as the numeric value of the measurement
    value_as_number=row['Number of cycles/ administrations']
    # measurement_value ??? would be the same as value_as_number, so it is not set separately
    measurement_event_id=row['idThisEvent']
    cur.execute(sqlMeasurement,(person_id, measurement_concept_id, date, measurement_type_concept_id,value_as_number,measurement_event_id,meas_event_field_concept_id))
    conn.commit()

### Systemic Treatment to Visit Occurrence

We need visit concept id and visit type concept id


In [None]:


insertVisitOccurrence="""
    INSERT INTO omopcdm.visit_occurrence (visit_occurrence_id, person_id, visit_concept_id, visit_start_date, visit_end_date, visit_type_concept_id)
    VALUES (%s, %s, %s, %s, %s, %s)
"""

def toTableEpisodeEventVisit(idEpisode , idVisitOccurrence):
    cur.execute(queryEVPO,(idEpisode,idVisitOccurrence,1147070))
    conn.commit()

for idx, row in df_tables.iterrows():
    # queryPerson reads from the IDEA4RC database, so it must run on curIDEA, not on the OMOP cursor
    curIDEA.execute(queryPerson, (row['episodeEvent'],))
    res=curIDEA.fetchone()
    person_id=res[0]
    visit_concept_id=0 #Still don't know what to do with this one
    visit_start_date=row['startDate'] #Should I be using this one or not?
    visit_end_date=row['endDate']
    visit_type_concept_id=0 #Still don't know what to do with this one
    newId = datetime.now().strftime('%Y%m%d%H%M%S') + str(uuid4())
    toTableEpisodeEventVisit(row['idThisEvent'],newId)
    cur.execute(insertVisitOccurrence,(newId, person_id, visit_concept_id,visit_start_date,visit_end_date,visit_type_concept_id))
    conn.commit()



# Other IDEA4RC treatment tables

https://docs.google.com/spreadsheets/d/1Vw1Dr2K4oG__cDQTutGaJhZvGUvQTLwc4qWreP6qMSs/edit?gid=449874117#gid=449874117

![First image](images/OtherTreatmentIDEA1.png)

The other IDEA4RC treatment tables (surgery, radiotherapy, etc.; the green highlighted tables in the DM) are mapped to drug exposure or procedure occurrence and linked to episodes through episode events (to be confirmed).

The two dates are included because the treatment regimen might start and end on different dates due to possible toxicity issues or other problems. Therefore, a second date was added for such cases.

For the drugs, you query DrugsForTreatments using the ID of the treatment regimen.

Toxicity is assumed to be the main reason for a change of regimen. The regimen may also change for other reasons, but the exact conditions are not clear. Paolo will be consulted on how to model this situation in OMOP.

We need the disease extent (confined, invasive, metastatic), for example for a query such as "time to metastasis", but it is still not clear how to get it from IDEA4RC data. For sarcoma it should be the stage, for example, but for H&N this is less clear.

Maybe we could extend EpisodeEvent so that it includes not only the temporal evolution but also all the episode events from OMOP (https://athena.ohdsi.org/search-terms/terms?domain=Episode&page=1&pageSize=15&query=). Does this make sense?

Maybe for H&N we should get it from STAGING? The stages read as follows:

- Stage I: Often refers to confined cancer, where it is small and localized to its site of origin, without spreading.
- Stage II-III: Typically indicate invasive cancer, where the tumor has grown into surrounding tissues or nearby lymph nodes.
- Stage IV: Refers to metastatic cancer, where the disease has spread to distant parts of the body.
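
If staging turns out to be usable for this, the rough correspondence above could be applied as in the sketch below. It is only a heuristic that mirrors the text ("often", "typically"), not a validated clinical rule. The function name, the accepted input format (a Roman numeral, optionally prefixed with "Stage" and followed by a substage letter A to C) and the returned labels are assumptions made for illustration.

```python
import re

# Heuristic correspondence taken from the stage descriptions above
STAGE_TO_EXTENT = {
    "I": "confined",
    "II": "invasive",
    "III": "invasive",
    "IV": "metastatic",
}

# Accepts e.g. "I", "stage II", "Stage IIIB", "iv"; longer numerals are tried first
STAGE_PATTERN = re.compile(r"^\s*(?:stage\s*)?(IV|III|II|I)[ABC]?\s*$", re.IGNORECASE)


def disease_extent_from_stage(stage):
    """Return 'confined', 'invasive' or 'metastatic', or None if the stage is not recognised."""
    match = STAGE_PATTERN.match(str(stage))
    if match is None:
        return None
    return STAGE_TO_EXTENT[match.group(1).upper()]


for value in ["I", "Stage IIIB", "iv", "unknown"]:
    print(value, "->", disease_extent_from_stage(value))
```

Returning None for unrecognised values keeps missing or free-text staging visible instead of silently assigning an extent.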