# CDM_SOURCE

See [cdm_source](https://ohdsi.github.io/CommonDataModel/cdm54.html#cdm_source). This metadata table records the provenance of this OMOP instance (source, ETL, CDM and vocabulary versions) for future reference.

```{mermaid}
erDiagram
    OMOP_CDM_SOURCE {
        varchar(255) cdm_source_name
        varchar(25) cdm_source_abbreviation
        varchar(255) cdm_holder
        varchar(MAX) source_description
        varchar(255) source_documentation_reference
        varchar(255) cdm_etl_reference 
        date source_release_date
        date cdm_release_date
        varchar(10) cdm_version
        integer cdm_version_concept_id
        varchar(20) vocabulary_version
    }
```

The CDM_SOURCE table is created by the [genomop_cdm_source.py](../examples/genomop_cdm_source.py) script. The script does no data processing: it fills the table from details that are entered manually in a parameters file.

The script performs the following steps:

1. Reads the parameters file.
2. Uses the parameters to fill the table.
3. Ensures the fields follow the schema for the table.
4. Saves the file.
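
The steps above can be sketched roughly as follows. This is an illustrative outline, not the actual contents of `genomop_cdm_source.py`: the function names, the `CDM_SOURCE.csv` file name and the CSV output format are all assumptions, and step 1 (parsing the YAML file) is left out, so `fields` stands for the already-parsed `cdm_source_fields` mapping.

```python
import csv
from datetime import date
from pathlib import Path

# Column order and Python type per field, following the diagram above.
CDM_SOURCE_SCHEMA = {
    "cdm_source_name": str,
    "cdm_source_abbreviation": str,
    "cdm_holder": str,
    "source_description": str,
    "source_documentation_reference": str,
    "cdm_etl_reference": str,
    "source_release_date": date,
    "cdm_release_date": date,
    "cdm_version": str,
    "cdm_version_concept_id": int,
    "vocabulary_version": str,
}


def coerce(value, target_type):
    """Cast one value to the type the schema expects."""
    if target_type is date:
        # Accept both date objects and ISO strings such as "2023-12-14".
        return value if isinstance(value, date) else date.fromisoformat(str(value))
    return target_type(value)


def build_cdm_source_row(fields: dict) -> dict:
    """Steps 2 and 3: fill the single row and enforce the schema types."""
    missing = [name for name in CDM_SOURCE_SCHEMA if name not in fields]
    if missing:
        raise KeyError(f"Missing cdm_source_fields: {missing}")
    return {name: coerce(fields[name], t) for name, t in CDM_SOURCE_SCHEMA.items()}


def save_cdm_source(row: dict, output_dir: Path) -> Path:
    """Step 4: write the one-row table (CSV output is an assumption)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / "CDM_SOURCE.csv"  # hypothetical file name
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(CDM_SOURCE_SCHEMA))
        writer.writeheader()
        writer.writerow(row)
    return out_path
```

Keeping the schema in a single ordered mapping means the column order of the output and the type checks cannot drift apart.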


To configure the table, create a YAML file, [genomop_cdm_source_params.yaml](../examples/genomop_cdm_source_params.yaml), with the following structure:

```YAML
output_dir: preomop/04_omop_intermediate/CDM_SOURCE/
cdm_source_fields:
  cdm_source_name: package
  cdm_source_abbreviation: package
  cdm_holder: FPS
  source_description: OMOP instance for the study of <project> data.
  source_documentation_reference: ""
  cdm_etl_reference: v1.0
  source_release_date: 2023-12-14
  cdm_release_date: 2025-01-21
  cdm_version: "5.4"
  cdm_version_concept_id: 756265
  vocabulary_version: v20240830
```

The necessary parameters are:
- `output_dir`: The path, relative to `data_dir`, of the directory where the output is saved.
- `cdm_source_fields`: The fields needed to populate the CDM_SOURCE table, see [cdm_source](http://ohdsi.github.io/CommonDataModel/cdm54.html#cdm_source):
  - `cdm_source_name`: The name of the CDM instance.
  - `cdm_source_abbreviation`: The abbreviation of the CDM instance.
  - `cdm_holder`: The holder of the CDM instance.
  - `source_description`: The description of the CDM instance.
  - `source_documentation_reference`: Link to the documentation of the CDM instance.
  - `cdm_etl_reference`: Version of the ETL script used, e.g. a link to the Git release.
  - `source_release_date`: The date the data was extracted from the source system. In some systems this is the same as the date the ETL was run. Typically, the latest event date in the source falls on the `source_release_date`.
  - `cdm_release_date`: The date the ETL script was completed. Typically this is after the `source_release_date`.
  - `cdm_version`: Version of the OMOP CDM used, as a string, e.g. "5.4".
  - `cdm_version_concept_id`: The concept ID representing the version of the CDM (usually 756265).
  - `vocabulary_version`: Version of the OMOP standardised vocabularies loaded (usually v20240830).
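
A few of these constraints are easy to check before the table is written. The sketch below is an assumption about how such a check could look, not code from the package. It takes the column widths from the diagram at the top of this page and flags a `cdm_release_date` that precedes the `source_release_date`. The function name `check_cdm_source_fields` is hypothetical. Note that common YAML parsers (PyYAML's `safe_load`, for example) load an unquoted `5.4` as a float, which is why `cdm_version` is quoted in the parameters file and why the check insists on strings.

```python
from datetime import date

# Maximum lengths of the varchar columns, as listed in the diagram.
# source_description is varchar(MAX), so no limit is enforced for it.
MAX_LENGTHS = {
    "cdm_source_name": 255,
    "cdm_source_abbreviation": 25,
    "cdm_holder": 255,
    "source_documentation_reference": 255,
    "cdm_etl_reference": 255,
    "cdm_version": 10,
    "vocabulary_version": 20,
}


def check_cdm_source_fields(fields: dict) -> list[str]:
    """Return readable problems; an empty list means nothing was flagged."""
    problems = []
    for name, limit in MAX_LENGTHS.items():
        value = fields.get(name)
        if not isinstance(value, str):
            problems.append(f"{name}: expected a string, got {type(value).__name__}")
        elif len(value) > limit:
            problems.append(f"{name}: {len(value)} characters exceeds varchar({limit})")

    source_date = fields.get("source_release_date")
    cdm_date = fields.get("cdm_release_date")
    # The ordering is only "typical", so treat this as a warning to review.
    if isinstance(source_date, date) and isinstance(cdm_date, date):
        if cdm_date < source_date:
            problems.append("cdm_release_date is earlier than source_release_date")
    return problems
```

Returning a list of messages instead of raising on the first failure lets all problems in the parameters file be reported in one pass.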

The `output_dir` parameter is resolved relative to the `data_dir` folder defined in the `.env` file.
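
As a minimal sketch of that resolution, assuming the `.env` file has already been loaded into the process environment and that the variable is named `DATA_DIR` (both assumptions, as is the function name `resolve_output_dir`; the package may read the setting differently):

```python
import os
from pathlib import Path


def resolve_output_dir(output_dir: str, env_var: str = "DATA_DIR") -> Path:
    """Join the relative output_dir from the YAML file onto the data_dir root."""
    data_dir = os.environ.get(env_var)  # hypothetical variable name
    if not data_dir:
        raise RuntimeError(f"{env_var} is not set; define it in the .env file")
    return Path(data_dir) / output_dir
```

With `DATA_DIR=/srv/data`, for instance, the `output_dir` from the example parameters file would resolve to `/srv/data/preomop/04_omop_intermediate/CDM_SOURCE`.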