# PERSON

See the [PERSON](https://ohdsi.github.io/CommonDataModel/cdm54.html#person) table specification in the OMOP CDM v5.4 documentation. The OMOP PERSON table has the fields shown in the following diagram:

```{mermaid}
erDiagram
    OMOP_PERSON {
        integer person_id
        varchar(50) person_source_value
        integer gender_concept_id
        varchar(50) gender_source_value
        integer gender_source_concept_id
        integer year_of_birth 
        integer month_of_birth
        integer day_of_birth
        datetime birth_datetime
        integer race_concept_id
        varchar(50) race_source_value
        integer race_source_concept_id
        integer ethnicity_concept_id
        varchar(50) ethnicity_source_value
        integer ethnicity_source_concept_id
        integer location_id
        integer provider_id
        integer care_site_id
    }
```

The script that handles the transformation from the original file to OMOP format is [genomop_person.py](../examples/genomop_person.py). 

In outline, the script:
1. Reads the parameters file.
2. Iterates over the input files (see parameter `input_files`). For each file, it:
   1. Reads the table.
   2. Assigns the date columns.
   3. Renames specific columns to fit the OMOP standard (see parameter `column_map`).
   4. Applies custom value mappings if needed (see parameter `concept_id_map`).
   5. Fills the rest of the columns.
   6. Saves the file.
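
As a rough illustration of the renaming and filling steps, here is a minimal sketch that works on in-memory rows instead of parquet files. The function `to_person_rows`, the list-of-dicts representation, and the choice to drop columns that are not part of PERSON are assumptions made for this example; they are not taken from `genomop_person.py`, which may behave differently.

```python
# Column names taken from the PERSON diagram above.
PERSON_COLUMNS = [
    "person_id", "person_source_value",
    "gender_concept_id", "gender_source_value", "gender_source_concept_id",
    "year_of_birth", "month_of_birth", "day_of_birth", "birth_datetime",
    "race_concept_id", "race_source_value", "race_source_concept_id",
    "ethnicity_concept_id", "ethnicity_source_value", "ethnicity_source_concept_id",
    "location_id", "provider_id", "care_site_id",
]


def to_person_rows(rows, column_map):
    """Rename source columns and fill the remaining PERSON columns with None.

    `rows` is a list of dicts, a hypothetical stand-in for one input table.
    """
    out = []
    for row in rows:
        # Rename source columns to their OMOP names; unknown names are kept.
        renamed = {column_map.get(name, name): value for name, value in row.items()}
        # Keep only PERSON columns (assumption), filling missing ones with None.
        out.append({col: renamed.get(col) for col in PERSON_COLUMNS})
    return out


source_rows = [{"person_id": 1, "desc_sexo": "Mujer"}]
person_rows = to_person_rows(source_rows, {"desc_sexo": "gender_source_value"})
print(person_rows[0]["gender_source_value"])  # Mujer
```

Reading, date assignment, value mapping, and saving are left out of this sketch on purpose.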
 
The only required setup is to specify the file that holds the information for each patient, typically the corresponding sociodemo file.

To do so, we create a YAML file, [genomop_person_params.yaml](../examples/genomop_person_params.yaml), with the following structure:

```YAML
input_dir: preomop/03_omop_initial/
output_dir: preomop/04_omop_intermediate/PERSON/
input_files:
  - 01_sociodemo.parquet
column_map:
  01_sociodemo.parquet:
    desc_sexo: gender_source_value
concept_id_map:
  01_sociodemo.parquet:
    gender_source_value:
      Mujer: 8532
      Hombre: 8507
```

The necessary parameters are:
- `input_dir` is the path from `data_dir` to the directory that contains the input data.
- `output_dir` is the path from `data_dir` to the directory where the output will be saved.
- `input_files` is the list of files to be used.
- `column_map` is a dictionary that defines, for each file in `input_files`, the mapping from the original column names to their OMOP counterparts, i.e. `original_name: omop_name`.
- `concept_id_map` is a dictionary that defines, for each file in `input_files` and for each **new OMOP column**, the mapping from the original values to OMOP standard codes.
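
A quick consistency check on these parameters can catch typos before the script runs. The sketch below is hypothetical: `check_params` is not part of the project, and the parameters are written as a Python dict rather than loaded from the YAML file, to keep the example self-contained.

```python
REQUIRED_KEYS = ("input_dir", "output_dir", "input_files", "column_map", "concept_id_map")


def check_params(params):
    """Return a list of problems found in `params` (empty if none)."""
    problems = [f"missing parameter: {key}" for key in REQUIRED_KEYS if key not in params]
    files = set(params.get("input_files", []))
    # Every file named in a mapping should also be listed in `input_files`.
    for section in ("column_map", "concept_id_map"):
        for filename in params.get(section, {}):
            if filename not in files:
                problems.append(f"{section} refers to unlisted file: {filename}")
    return problems


# Same content as the YAML example above, written as a dict.
params = {
    "input_dir": "preomop/03_omop_initial/",
    "output_dir": "preomop/04_omop_intermediate/PERSON/",
    "input_files": ["01_sociodemo.parquet"],
    "column_map": {"01_sociodemo.parquet": {"desc_sexo": "gender_source_value"}},
    "concept_id_map": {
        "01_sociodemo.parquet": {"gender_source_value": {"Mujer": 8532, "Hombre": 8507}}
    },
}
print(check_params(params))  # []
```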

In this example, the values "Mujer" and "Hombre", which appear in the `gender_source_value` column, are mapped to the OMOP concept IDs 8532 (FEMALE) and 8507 (MALE), respectively.
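
The value mapping for this example could look roughly like the sketch below. Two things here are assumptions rather than documented behavior of the script: that the resulting codes are written to `gender_concept_id`, and that values with no entry in the map fall back to 0, the OMOP concept ID for "No matching concept". The helper `map_gender` is hypothetical.

```python
def map_gender(rows, value_map, default=0):
    """Return copies of `rows` with gender_concept_id derived from gender_source_value."""
    return [
        # Unmapped or missing source values fall back to `default` (assumption).
        {**row, "gender_concept_id": value_map.get(row.get("gender_source_value"), default)}
        for row in rows
    ]


gender_map = {"Mujer": 8532, "Hombre": 8507}
rows = [
    {"gender_source_value": "Mujer"},
    {"gender_source_value": "Hombre"},
    {"gender_source_value": None},
]
mapped = map_gender(rows, gender_map)
print([row["gender_concept_id"] for row in mapped])  # [8532, 8507, 0]
```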