# MEASUREMENT

See [measurement](https://ohdsi.github.io/CommonDataModel/cdm54.html#measurement). 

This table stores the measurements and tests performed on a patient. It holds both the measurement order and the measurement result, and it has a key:value structure. Any “observation” of the patient that required a test to obtain belongs here.

```{mermaid}
erDiagram
    OMOP_MEASUREMENT {
        integer measurement_id
        integer person_id
        integer measurement_concept_id
        date measurement_date
        datetime measurement_datetime
        varchar(10) measurement_time
        integer measurement_type_concept_id
        integer operator_concept_id
        float value_as_number
        integer value_as_concept_id
        integer unit_concept_id
        float range_low
        float range_high
        integer provider_id
        integer visit_occurrence_id
        integer visit_detail_id
        varchar(50) measurement_source_value
        integer measurement_source_concept_id
        varchar(50) unit_source_value
        integer unit_source_concept_id
        varchar(50) value_source_value
        integer measurement_event_id
        integer meas_event_field_concept_id
    }
```

The transformation is run by the script [genomop_measurement.py](../examples/genomop_measurement.py).

This script performs the following steps:
1. Loads the parameters.
2. Loads the vocabulary tables.
   - In this case it also loads the CLC codes used to map the measurement codes in the MPA module of BPS.
3. Loads each file and, for each one:
   1. Assigns a new `vocabulary_id` column if needed. See the `append_vocabulary` parameter.
   2. Renames the column containing the measurement names or codes to `source_value`. See the `column_map` parameter.
   3. Maps the source codes to a specific column in the CONCEPT table to retrieve the `source_concept_id`. See the `vocabulary_config` parameter.
4. Combines all files into a single table.
5. Ensures that numeric values are stored as numbers.
6. Tries to map the measurement units. See `create_vocabulary_mapping()` in module [general.py](../bps_to_omop/general.py) and `map_measurement_units()` in module [measurement.py](../bps_to_omop/measurement.py).
   - If some mappings cannot be performed, a warning is issued.
7. Maps the `source_concept_id` codes to `concept_id`.
   - Both `measurement_source_concept_id` and `unit_source_concept_id` are mapped.
8. Checks for codes that were not mapped.
   - The parameters `unmapped_measurement` and `unmapped_unit` can be used to define additional mappings for measurement codes and units, respectively.
9. Creates the primary key (`measurement_id`).
10. Finds the entries that fall within a visit in the VISIT_OCCURRENCE table and assigns them the corresponding `visit_occurrence_id`.
11. Adapts the table to the schema of the MEASUREMENT table.
12. Saves the OMOP table to the defined output folder.
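
Step 10 is the least obvious of these, so here is a minimal sketch of how a measurement could be matched to the visit that contains it. It is not the code used by `genomop_measurement.py`: the helper name `assign_visit_occurrence_id`, the use of pandas, the comparison on `measurement_date` and the "first match wins" tie-break are all assumptions made for illustration.

```python
import pandas as pd


def assign_visit_occurrence_id(measurement, visits):
    """Hypothetical helper: attach the id of a visit that contains each measurement."""
    meas = measurement.reset_index(drop=True).copy()
    meas["_row"] = meas.index
    # Pair every measurement with all visits of the same person.
    pairs = meas[["_row", "person_id", "measurement_date"]].merge(
        visits[["person_id", "visit_occurrence_id",
                "visit_start_date", "visit_end_date"]],
        on="person_id",
        how="inner",
    )
    # Keep only the pairs where the measurement date lies inside the visit.
    inside = pairs[
        (pairs["measurement_date"] >= pairs["visit_start_date"])
        & (pairs["measurement_date"] <= pairs["visit_end_date"])
    ]
    # If several visits contain the same measurement, keep the first one found
    # (an arbitrary tie-break chosen for this sketch).
    first = inside.drop_duplicates("_row").set_index("_row")["visit_occurrence_id"]
    # Measurements outside every visit get a missing value.
    meas["visit_occurrence_id"] = meas["_row"].map(first).astype("Int64")
    return meas.drop(columns="_row")


# Invented example data.
measurement = pd.DataFrame({
    "person_id": [1, 1, 2],
    "measurement_date": pd.to_datetime(["2020-01-05", "2020-03-01", "2020-01-05"]),
})
visits = pd.DataFrame({
    "visit_occurrence_id": [10, 20],
    "person_id": [1, 2],
    "visit_start_date": pd.to_datetime(["2020-01-01", "2020-02-01"]),
    "visit_end_date": pd.to_datetime(["2020-01-10", "2020-02-10"]),
})
result = assign_visit_occurrence_id(measurement, visits)
```

Only the first measurement falls inside a visit of the same person, so it is the only row that receives a `visit_occurrence_id`; the other two stay missing.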
 

The configuration file is [genomop_measurement_params.yaml](../examples/genomop_measurement_params.yaml). It must have the following structure:

```yaml
input_dir: preomop/03_omop_initial/
output_dir: preomop/04_omop_intermediate/MEASUREMENT/
input_files:
  - 03_MPA.parquet
vocab_dir: raw/omop_vocab/
visit_dir: preomop/04_omop_intermediate/VISIT_OCCURRENCE/
append_vocabulary:
  03_MPA.parquet: CLC
column_map:
  03_MPA.parquet:
    desc_clc: measurement_source_value
    valor_convencional: value_source_value
vocabulary_config:
  03_MPA.parquet: 
    CLC: 
value_map:
  03_MPA.parquet: numeric
unmapped_measurement:
  0: 0
unmapped_unit:
  "x 10^3/µL": 8848
```
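
Most of the entries above are dicts keyed by file name, so a typo in a file name can go unnoticed. The sketch below shows one way to check a loaded configuration for that kind of mistake. It is only an illustration: `check_params` is a hypothetical helper that does not exist in the repository, and the `params` dict is written out by hand to mirror the YAML instead of being read from disk.

```python
# Entries that are expected to be dicts keyed by input file name.
PER_FILE_KEYS = ("append_vocabulary", "column_map", "vocabulary_config", "value_map")


def check_params(params):
    """Hypothetical helper: return a list of problems found in the configuration."""
    problems = []
    files = set(params.get("input_files", []))
    for key in PER_FILE_KEYS:
        # A missing or empty entry is treated as "nothing to check".
        for filename in params.get(key) or {}:
            if filename not in files:
                problems.append(f"{key}: '{filename}' is not listed in input_files")
    for filename, kind in (params.get("value_map") or {}).items():
        if kind not in ("numeric", "concept"):
            problems.append(f"value_map: '{filename}' has unknown type '{kind}'")
    return problems


# The example configuration above, as the dict a YAML loader would produce.
params = {
    "input_dir": "preomop/03_omop_initial/",
    "output_dir": "preomop/04_omop_intermediate/MEASUREMENT/",
    "input_files": ["03_MPA.parquet"],
    "vocab_dir": "raw/omop_vocab/",
    "visit_dir": "preomop/04_omop_intermediate/VISIT_OCCURRENCE/",
    "append_vocabulary": {"03_MPA.parquet": "CLC"},
    "column_map": {
        "03_MPA.parquet": {
            "desc_clc": "measurement_source_value",
            "valor_convencional": "value_source_value",
        }
    },
    "vocabulary_config": {"03_MPA.parquet": {"CLC": None}},
    "value_map": {"03_MPA.parquet": "numeric"},
    "unmapped_measurement": {0: 0},
    "unmapped_unit": {"x 10^3/µL": 8848},
}
problems = check_params(params)

# A deliberately broken variant: unknown file name and unknown value type.
broken = dict(params, value_map={"other.parquet": "text"})
broken_problems = check_params(broken)
```

For the example configuration `problems` is empty, while the broken variant reports both the unknown file and the unknown value type.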

The parameters are:
- `input_dir` is the path from `data_dir` to the directory that contains the input data.
- `output_dir` is the path from `data_dir` to the directory where the output data will be saved.
- `input_files` is the list of files, as paths from `data_dir / input_dir`, to be used.
- `vocab_dir` is the path from `data_dir` to the directory where the vocabulary tables (CONCEPT, CONCEPT_RELATIONSHIP, etc.) are.
- `visit_dir` is the path from `data_dir` to the directory where the VISIT_OCCURRENCE table is.
- `append_vocabulary` is a dict that defines, for each file in `input_files`, the name of the vocabulary to be added as a new uniform column.
  - i.e. a new column, `vocabulary_id`, will be added to the table with the provided value for every row.
  - This entry is not mandatory.
- `column_map` is a dict that defines, for each file in `input_files`, the column that will be renamed to `source_value` to perform the identification of codes in the CONCEPT table.
  - It can be used to rename any other column if needed.
- `vocabulary_config` is a dict that defines, for each file in `input_files`, how to map each vocabulary to the CONCEPT table.
  - It maps each vocabulary present in the file (key) to the column of the CONCEPT table, i.e. `concept_name` or `concept_code`, that should be used for the mapping (value).
  - Each source value has to be mapped to a concept in the CONCEPT table. Usually the source values are either descriptors (which map to `concept_name`) or codes (which map to `concept_code`).
- `value_map` defines the type of data that each file contains. It can be either 'numeric' or 'concept'.
  - 'numeric' will use the column `value_source_value` and ensure it is numeric. It will create a column `value_source_concept_id` filled with NaNs.
  - 'concept' will use the columns `value_source_value` and `value_vocabulary_id` to find the equivalent OMOP code. It will create a column `value_as_number` filled with NaNs.
  - The columns `value_source_value` and `value_vocabulary_id` have to be created beforehand.
- `unmapped_measurement` is a dict that maps each **measurement** source code (key) to the standard OMOP code (value). It can be used when the `source_value` or `source_concept_id` columns have no mapping in the CONCEPT_RELATIONSHIP table.
- `unmapped_unit` is a dict that maps each **unit** source code (key) to the standard OMOP code (value). It can be used when the `source_value` or `source_concept_id` columns have no mapping in the CONCEPT_RELATIONSHIP table.
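
To make the role of `unmapped_measurement` and `unmapped_unit` more concrete, the following sketch applies a manual mapping only to the rows that the vocabulary-based mapping left empty. It assumes pandas and a hypothetical helper, `apply_unmapped`; the actual script may key the dict on a different column or handle leftovers differently, and the example rows are invented.

```python
import pandas as pd


def apply_unmapped(table, source_col, target_col, unmapped):
    """Hypothetical helper: fill missing target_col values from a manual dict."""
    out = table.copy()
    # Source values absent from the dict become NaN and therefore change nothing.
    fallback = out[source_col].map(unmapped)
    # Existing mappings are kept; only missing ones are filled.
    out[target_col] = out[target_col].fillna(fallback)
    return out


# Invented rows: the first unit was already mapped, the other two were not.
units = pd.DataFrame({
    "unit_source_value": ["mg/dL", "x 10^3/µL", "unknown unit"],
    "unit_concept_id": [8840.0, None, None],
})
unmapped_unit = {"x 10^3/µL": 8848}

fixed = apply_unmapped(units, "unit_source_value", "unit_concept_id", unmapped_unit)

# Whatever is still missing is what a later check would report.
still_unmapped = fixed.loc[fixed["unit_concept_id"].isna(), "unit_source_value"].tolist()
```

The manual dict acts as a fallback, not as an override: values that were already mapped are left untouched, and anything it does not cover remains visible as unmapped.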

Important notes:
- The parameters `input_dir` and `output_dir` are defined in relation to the `data_dir` folder defined in the `.env` file. 
- Even though the `vocabulary_config` parameter allows mapping from multiple vocabularies, assigning multiple `vocabulary_id` values cannot currently be done here and has to be done in the `process_rare_files` stage.
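
As an illustration of the first note, the sketch below resolves the directory parameters against `data_dir`. The environment variable name `DATA_DIR` and the helper `resolve_dirs` are assumptions made for this example; the name actually used in the `.env` file, and the way that file is loaded, may differ.

```python
import os
from pathlib import Path

# Parameters that hold paths relative to data_dir.
DIR_KEYS = ("input_dir", "output_dir", "vocab_dir", "visit_dir")


def resolve_dirs(params, env=os.environ):
    """Hypothetical helper: turn relative directory parameters into full paths."""
    # DATA_DIR is an assumed variable name, expected to be loaded from the .env file.
    data_dir = Path(env["DATA_DIR"])
    return {key: data_dir / params[key] for key in DIR_KEYS if key in params}


params = {
    "input_dir": "preomop/03_omop_initial/",
    "output_dir": "preomop/04_omop_intermediate/MEASUREMENT/",
    "input_files": ["03_MPA.parquet"],
}
# The environment is passed explicitly here so the example does not depend on a real .env file.
paths = resolve_dirs(params, env={"DATA_DIR": "/data/project"})

# Input files are then located relative to data_dir / input_dir.
input_paths = [paths["input_dir"] / name for name in params["input_files"]]
```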