# 🏥 NeMo Data Designer: Clinical Trials Dataset Generator

#### 📚 What you'll learn

This notebook demonstrates how to use structured samplers, person/PII generators, and LLMs to create a realistic\
synthetic clinical trials dataset—including trial metadata, participant demographics, investigator details,\
clinical notes, and adverse event reports—for evaluating data protection and anonymization techniques.

<br>

> 👋 **IMPORTANT** – Environment Setup
>
> - If you haven't already, follow the instructions in the [README](../../../README.md) to install the necessary dependencies.
>
> - You may need to restart your notebook's kernel after setting up the environment.
> - In this notebook, we assume you have a self-hosted instance of Data Designer up and running.
>
> - For deployment instructions, see the [Installation Options](https://docs.nvidia.com/nemo/microservices/latest/design-synthetic-data-from-scratch-or-seeds/index.html#installation-options) section of the [NeMo Data Designer documentation](https://docs.nvidia.com/nemo/microservices/latest/design-synthetic-data-from-scratch-or-seeds/index.html).


### 📦 Import the essentials

- The `data_designer` module of `nemo_microservices` exposes Data Designer's high-level SDK.

- The `essentials` module provides quick access to the most commonly used objects.


In [None]:
from nemo_microservices.data_designer.essentials import (
    BernoulliSamplerParams,
    CategorySamplerParams,
    DataDesignerConfigBuilder,
    GaussianSamplerParams,
    InferenceParameters,
    LLMTextColumnConfig,
    ModelConfig,
    NeMoDataDesignerClient,
    PersonSamplerParams,
    SamplerColumnConfig,
    SamplerType,
    SubcategorySamplerParams,
    UUIDSamplerParams,
    UniformSamplerParams,
)

### ⚙️ Initialize the NeMo Data Designer Client

- `NeMoDataDesignerClient` is responsible for submitting generation requests to the microservice.


In [None]:
NEMO_MICROSERVICES_BASE_URL = "http://localhost:8080"

data_designer_client = NeMoDataDesignerClient(base_url=NEMO_MICROSERVICES_BASE_URL)

### 🎛️ Define model configurations

- Each `ModelConfig` defines a model that can be used during the generation process.

- The "model alias" is used to reference the model in the Data Designer config (as we will see below).

- The "model provider" is the external service that hosts the model (see [the model config docs](https://docs.nvidia.com/nemo/microservices/latest/design-synthetic-data-from-scratch-or-seeds/configure-models.html) for more details).

- By default, the microservice uses [build.nvidia.com](https://build.nvidia.com/models) as the model provider.


In [None]:
# This name is set in the microservice deployment configuration.
MODEL_PROVIDER = "nvidiabuild"

# The model ID is from build.nvidia.com.
MODEL_ID = "nvidia/nvidia-nemotron-nano-9b-v2"

# We choose this alias to be descriptive for our use case.
MODEL_ALIAS = "nemotron-nano-v2"

# This system prompt disables reasoning for the nemotron-nano-v2 model.
SYSTEM_PROMPT = "/no_think"

model_configs = [
    ModelConfig(
        alias=MODEL_ALIAS,
        model=MODEL_ID,
        provider=MODEL_PROVIDER,
        inference_parameters=InferenceParameters(
            temperature=0.6,
            top_p=0.95,
            max_tokens=1024,
        ),
    )
]
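
For intuition about the `temperature` and `top_p` values set above, here is a minimal pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering over a toy set of logits. The function name `nucleus_distribution` and the logits are hypothetical, and how the hosted model applies these parameters in practice depends on the provider.

```python
import math


def nucleus_distribution(logits, temperature=0.6, top_p=0.95):
    """Return the renormalized probabilities of the tokens kept by top-p filtering."""
    # Lower temperatures sharpen the distribution before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}


toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
print(nucleus_distribution(toy_logits))
```

With these toy logits and the settings used in this notebook, the least likely token is dropped and the remaining two are renormalized.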

### 🏗️ Initialize the Data Designer Config Builder

- The Data Designer config defines the dataset schema and generation process.

- The config builder provides an intuitive interface for building this configuration.

- The list of model configs is provided to the builder at initialization.


In [None]:
config_builder = DataDesignerConfigBuilder(model_configs=model_configs)

## 🎲 Getting Started with Sampler Columns

- Sampler columns generate synthetic data without calling an LLM.

- They are particularly useful for **steering the diversity** of the generated data, as we demonstrate below.

- The persona samplers let you sample realistic details of individuals using a model trained on the US Census.\
  If the persona locale is anything other than `en_US`, the personas are generated with Faker instead.


In [None]:
# Add person samplers for the different roles in the clinical trial (en_US locale).
config_builder.add_column(
    SamplerColumnConfig(
        name="participant",
        sampler_type=SamplerType.PERSON,
        params=PersonSamplerParams(locale="en_US"),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="investigator",
        sampler_type=SamplerType.PERSON,
        params=PersonSamplerParams(locale="en_US"),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="coordinator",
        sampler_type=SamplerType.PERSON,
        params=PersonSamplerParams(locale="en_US"),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="sponsor",
        sampler_type=SamplerType.PERSON,
        params=PersonSamplerParams(locale="en_US"),
    )
)

### Creating Trial Information

Next, we'll create the basic trial information:

- Study ID (unique identifier)
- Trial phase and therapeutic area
- Study design details
- Start and end dates for the trial


In [None]:
# Study identifiers
config_builder.add_column(
    SamplerColumnConfig(
        name="study_id",
        sampler_type=SamplerType.UUID,
        params=UUIDSamplerParams(prefix="CT-", short_form=True, uppercase=True),
    )
)

# Trial phase
config_builder.add_column(
    SamplerColumnConfig(
        name="trial_phase",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Phase I", "Phase II", "Phase III", "Phase IV"],
            weights=[0.2, 0.3, 0.4, 0.1],
        ),
    )
)

# Therapeutic area
config_builder.add_column(
    SamplerColumnConfig(
        name="therapeutic_area",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=[
                "Oncology",
                "Cardiology",
                "Neurology",
                "Immunology",
                "Infectious Disease",
            ],
            weights=[0.3, 0.2, 0.2, 0.15, 0.15],
        ),
    )
)

# Study design
config_builder.add_column(
    SamplerColumnConfig(
        name="study_design",
        sampler_type=SamplerType.SUBCATEGORY,
        params=SubcategorySamplerParams(
            category="trial_phase",
            values={
                "Phase I": [
                    "Single Arm",
                    "Dose Escalation",
                    "First-in-Human",
                    "Safety Assessment",
                ],
                "Phase II": [
                    "Randomized",
                    "Double-Blind",
                    "Proof of Concept",
                    "Open-Label Extension",
                ],
                "Phase III": [
                    "Randomized Controlled",
                    "Double-Blind Placebo-Controlled",
                    "Multi-Center",
                    "Pivotal",
                ],
                "Phase IV": [
                    "Post-Marketing Surveillance",
                    "Real-World Evidence",
                    "Long-Term Safety",
                    "Expanded Access",
                ],
            },
        ),
    )
)


# Trial dates
config_builder.add_column(
    name="trial_start_date",
    column_type="sampler",
    sampler_type="datetime",
    params={"start": "2022-01-01", "end": "2023-06-30"},
    convert_to="%Y-%m-%d",
)

config_builder.add_column(
    name="trial_end_date",
    column_type="sampler",
    sampler_type="datetime",
    params={"start": "2023-07-01", "end": "2024-12-31"},
    convert_to="%Y-%m-%d",
)
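
As a rough mental model of the category and subcategory samplers configured above, the following standard-library sketch draws a trial phase by weight and then picks a study design from the list tied to that phase. It is not the Data Designer implementation: the helper name `sample_phase_and_design` is hypothetical, the design lists are abbreviated, and the uniform choice within a phase is an assumption.

```python
import random

# Values and weights copied from the trial_phase column above.
PHASES = ["Phase I", "Phase II", "Phase III", "Phase IV"]
PHASE_WEIGHTS = [0.2, 0.3, 0.4, 0.1]

# Abbreviated version of the study_design mapping (two designs per phase).
DESIGNS_BY_PHASE = {
    "Phase I": ["Single Arm", "Dose Escalation"],
    "Phase II": ["Randomized", "Double-Blind"],
    "Phase III": ["Randomized Controlled", "Pivotal"],
    "Phase IV": ["Post-Marketing Surveillance", "Long-Term Safety"],
}


def sample_phase_and_design(rng):
    """Draw a phase by weight, then a design from that phase's list."""
    phase = rng.choices(PHASES, weights=PHASE_WEIGHTS, k=1)[0]
    design = rng.choice(DESIGNS_BY_PHASE[phase])  # uniform choice is an assumption
    return phase, design


rng = random.Random(0)
samples = [sample_phase_and_design(rng) for _ in range(10_000)]
phase_iii_share = sum(phase == "Phase III" for phase, _ in samples) / len(samples)
print(f"Share of Phase III records: {phase_iii_share:.2f}")  # should land near 0.40
```

Changing the weights is the main lever for steering how often each phase, and therefore each family of study designs, appears in the dataset.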

### Participant Information

Now we'll create fields for participant demographics and enrollment details:

- Participant ID
- Name, birth date, and email address (taken from the `participant` person sampler)
- Enrollment status and dates
- Randomization assignment


In [None]:
# Participant identifiers and information
config_builder.add_column(
    SamplerColumnConfig(
        name="participant_id",
        sampler_type=SamplerType.UUID,
        params={"prefix": "PT-", "short_form": True, "uppercase": True},
    )
)

config_builder.add_column(
    name="participant_first_name",
    column_type="expression",
    expr="{{participant.first_name}}",
)

config_builder.add_column(
    name="participant_last_name",
    column_type="expression",
    expr="{{participant.last_name}}",
)

config_builder.add_column(
    name="participant_birth_date",
    column_type="expression",
    expr="{{participant.birth_date}}",
)

config_builder.add_column(
    name="participant_email",
    column_type="expression",
    expr="{{participant.email_address}}",
)

# Enrollment information
config_builder.add_column(
    name="enrollment_date",
    column_type="sampler",
    sampler_type="timedelta",
    params={
        "dt_min": 0,
        "dt_max": 60,
        "reference_column_name": "trial_start_date",
        "unit": "D",
    },
    convert_to="%Y-%m-%d",
)

config_builder.add_column(
    SamplerColumnConfig(
        name="participant_status",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Active", "Completed", "Withdrawn", "Lost to Follow-up"],
            weights=[0.6, 0.2, 0.15, 0.05],
        ),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="treatment_arm",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Treatment", "Placebo", "Standard of Care"], weights=[0.5, 0.3, 0.2]
        ),
    )
)
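
The `enrollment_date` column above offsets `trial_start_date` by 0 to 60 days. A minimal sketch of that idea with the standard library follows; the helper name `sample_enrollment_date` is hypothetical, and treating both bounds as inclusive is an assumption about the sampler.

```python
import random
from datetime import date, timedelta


def sample_enrollment_date(trial_start, rng, dt_min=0, dt_max=60):
    """Offset the trial start by a whole number of days in [dt_min, dt_max]."""
    offset_days = rng.randint(dt_min, dt_max)  # inclusive bounds are an assumption
    return trial_start + timedelta(days=offset_days)


rng = random.Random(42)
trial_start = date(2023, 6, 30)  # latest start date the datetime sampler allows
enrollment = sample_enrollment_date(trial_start, rng)
print(enrollment.strftime("%Y-%m-%d"))
```

Under these assumptions the latest possible enrollment date is 2023-08-29, while trial end dates begin at 2023-07-01, so an unconstrained enrollment date can fall after the trial has ended. This is why the date-ordering constraints added later in the notebook matter.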


### Investigator and Staff Information

Here we'll add information about the trial staff:

- Investigator information (principal investigator)
- Study coordinator details
- Site information


In [None]:
# Investigator information
config_builder.add_column(
    name="investigator_first_name",
    column_type="expression",
    expr="{{investigator.first_name}}",
)

config_builder.add_column(
    name="investigator_last_name",
    column_type="expression",
    expr="{{investigator.last_name}}",
)

config_builder.add_column(
    SamplerColumnConfig(
        name="investigator_id",
        sampler_type=SamplerType.UUID,
        params={"prefix": "INV-", "short_form": True, "uppercase": True},
    )
)

# Study coordinator information
config_builder.add_column(
    name="coordinator_first_name",
    column_type="expression",
    expr="{{coordinator.first_name}}",
)

config_builder.add_column(
    name="coordinator_last_name",
    column_type="expression",
    expr="{{coordinator.last_name}}",
)

config_builder.add_column(
    name="coordinator_email",
    column_type="expression",
    expr="{{coordinator.email_address}}",
)

# Site information
config_builder.add_column(
    SamplerColumnConfig(
        name="site_id",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["SITE-001", "SITE-002", "SITE-003", "SITE-004", "SITE-005"]
        ),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="site_location",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["London", "Manchester", "Birmingham", "Edinburgh", "Cambridge"]
        ),
    )
)

# Study costs
config_builder.add_column(
    SamplerColumnConfig(
        name="per_patient_cost",
        sampler_type=SamplerType.GAUSSIAN,
        params=GaussianSamplerParams(mean=15000, stddev=5000),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="participant_compensation",
        sampler_type=SamplerType.GAUSSIAN,
        params=GaussianSamplerParams(mean=500, stddev=200),
    )
)


### Clinical Measurements and Outcomes

These columns will track the key clinical data collected during the trial:

- Baseline and final measurements, with the percent change between them
- Dosing information
- Protocol compliance rate


In [None]:
# Basic clinical measurements
config_builder.add_column(
    SamplerColumnConfig(
        name="baseline_measurement",
        sampler_type=SamplerType.GAUSSIAN,
        params=GaussianSamplerParams(mean=100, stddev=15),
        convert_to="float",
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="final_measurement",
        sampler_type=SamplerType.GAUSSIAN,
        params=GaussianSamplerParams(mean=85, stddev=20),
        convert_to="float",
    )
)

# Calculate percent change
config_builder.add_column(
    name="percent_change",
    column_type="expression",
    expr="{{(final_measurement - baseline_measurement) / baseline_measurement * 100}}",
)

# Dosing information
config_builder.add_column(
    SamplerColumnConfig(
        name="dose_level",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Low", "Medium", "High", "Placebo"], weights=[0.3, 0.3, 0.2, 0.2]
        ),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="dose_frequency",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Once daily", "Twice daily", "Weekly", "Biweekly"],
            weights=[0.4, 0.3, 0.2, 0.1],
        ),
    )
)

# Protocol compliance
config_builder.add_column(
    SamplerColumnConfig(
        name="compliance_rate",
        sampler_type=SamplerType.UNIFORM,
        params=UniformSamplerParams(low=0.7, high=1.0),
    )
)
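
The `percent_change` expression above is plain arithmetic on the two measurement columns. The sketch below restates it as a Python function (the function itself is illustrative, not part of the SDK) and evaluates it on the two sampler means.

```python
def percent_change(baseline, final):
    """Relative change from baseline, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (final - baseline) / baseline * 100


# Using the two Gaussian sampler means as example inputs.
print(round(percent_change(100, 85), 2))  # -15.0
```

Because the Gaussian samplers are unbounded, a baseline at or near zero would make this ratio undefined or extreme; the positivity constraints added later in the notebook guard against that.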

### Adverse Events Tracking

Here we'll capture adverse events that occur during the clinical trial:

- Adverse event presence and type
- Severity and relatedness to treatment
- Resolution status


In [None]:
# Adverse event flags and details
config_builder.add_column(
    SamplerColumnConfig(
        name="has_adverse_event",
        sampler_type=SamplerType.BERNOULLI,
        params=BernoulliSamplerParams(p=0.3),
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="adverse_event_type",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=[
                "Headache",
                "Nausea",
                "Fatigue",
                "Rash",
                "Dizziness",
                "Pain at injection site",
                "Other",
            ],
            weights=[0.2, 0.15, 0.15, 0.1, 0.1, 0.2, 0.1],
        ),
        conditional_params={
            "has_adverse_event == 0": CategorySamplerParams(values=["None"])
        },
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="adverse_event_severity",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["Mild", "Moderate", "Severe", "Life-threatening"]
        ),
        conditional_params={
            "has_adverse_event == 0": CategorySamplerParams(values=["NA"])
        },
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="adverse_event_relatedness",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=[
                "Unrelated",
                "Possibly related",
                "Probably related",
                "Definitely related",
            ],
            weights=[0.2, 0.4, 0.3, 0.1],
        ),
        conditional_params={
            "has_adverse_event == 0": CategorySamplerParams(values=["NA"])
        },
    )
)

config_builder.add_column(
    SamplerColumnConfig(
        name="adverse_event_resolved",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(values=["NA"]),
        conditional_params={
            "has_adverse_event == 1": CategorySamplerParams(
                values=["Yes", "No"], weights=[0.8, 0.2]
            )
        },
    )
)
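
The `conditional_params` entries above swap in a different set of category values depending on `has_adverse_event`. As a hedged illustration of the resulting behavior (not the library's internals), this standard-library sketch draws the Bernoulli flag first and only samples an event type when the flag is 1. The helper name `sample_adverse_event` is hypothetical.

```python
import random

# Values and weights copied from the adverse_event_type column above.
EVENT_TYPES = [
    "Headache",
    "Nausea",
    "Fatigue",
    "Rash",
    "Dizziness",
    "Pain at injection site",
    "Other",
]
EVENT_WEIGHTS = [0.2, 0.15, 0.15, 0.1, 0.1, 0.2, 0.1]


def sample_adverse_event(rng, p=0.3):
    """Draw the flag, then pick the event type conditioned on it."""
    has_event = 1 if rng.random() < p else 0
    if has_event == 0:
        event_type = "None"  # mirrors the "has_adverse_event == 0" override
    else:
        event_type = rng.choices(EVENT_TYPES, weights=EVENT_WEIGHTS, k=1)[0]
    return {"has_adverse_event": has_event, "adverse_event_type": event_type}


rng = random.Random(7)
records = [sample_adverse_event(rng) for _ in range(1_000)]
event_share = sum(r["has_adverse_event"] for r in records) / len(records)
print(f"Share of records with an adverse event: {event_share:.2f}")  # should land near 0.30
```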


### Narrative text fields with style variations

These fields contain natural-language text that incorporates PII elements.
A `documentation_style` category column varies the writing style across three kinds of text:

1. Medical observations and notes
2. Adverse event descriptions
3. Protocol deviation explanations

**Note**: At this time, only a single file can be used as the seed. If you have multiple seed files, consolidate them into a single file first.


In [None]:
# Documentation style category
config_builder.add_column(
    SamplerColumnConfig(
        name="documentation_style",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=[
                "Formal and Technical",
                "Concise and Direct",
                "Detailed and Descriptive",
            ],
            weights=[0.4, 0.3, 0.3],
        ),
    )
)

# Medical observations - varies based on documentation style
config_builder.add_column(
    LLMTextColumnConfig(
        name="medical_observations",
        system_prompt=SYSTEM_PROMPT,
        model_alias=MODEL_ALIAS,
        prompt=(
            "{% if documentation_style == 'Formal and Technical' %}\n"
            "Write formal and technical medical observations for participant {{ participant_first_name }} {{ participant_last_name }}\n"
            "(ID: {{ participant_id }}) in the clinical trial for {{ therapeutic_area }} (Study ID: {{ study_id }}).\n"
            "Include observations related to their enrollment in the {{ dose_level }} dose group with {{ dose_frequency }} administration.\n"
            "Baseline measurement was {{ baseline_measurement }} and final measurement was {{ final_measurement }}, representing a "
            "change of {{ percent_change }}%.\n"
            "Use proper medical terminology, maintain a highly formal tone, and structure the notes in a technical format with appropriate "
            "sections and subsections. Include at least one reference to the site investigator, Dr. {{ investigator_last_name }}.\n"
            "{% elif documentation_style == 'Concise and Direct' %}"
            "Write brief, direct medical observations for patient {{ participant_first_name }} {{ participant_last_name }}\n"
            "({{ participant_id }}) in {{ therapeutic_area }} trial {{ study_id }}.\n"
            "Note: {{ dose_level }} dose, {{ dose_frequency }}. Baseline: {{ baseline_measurement }}. Final: {{ final_measurement }}.\n"
            "Change: {{ percent_change }}%.\n"
            "Keep notes extremely concise, using abbreviations where appropriate. Mention follow-up needs and reference\n"
            "Dr. {{ investigator_last_name }} briefly.\n"
            "{% else %}\n"
            "Write detailed and descriptive medical observations for participant {{ participant_first_name }} {{ participant_last_name }}\n"
            "enrolled in the {{ therapeutic_area }} clinical trial ({{ study_id }}).\n"
            "Provide a narrative description of their experience in the {{ dose_level }} dose group with {{ dose_frequency }} dosing.\n"
            "Describe how their measurements changed from baseline ({{ baseline_measurement }}) to final ({{ final_measurement }}),\n"
            "representing a {{ percent_change }}% change.\n"
            "Use a mix of technical terms and explanatory language. Include thorough descriptions of observed effects and subjective "
            "patient reports. Mention interactions with the investigator, Dr. {{ investigator_first_name }} {{ investigator_last_name }}.\n"
            "{% endif %}"
        ),
    )
)

# Adverse event descriptions - conditional on having an adverse event
config_builder.add_column(
    LLMTextColumnConfig(
        name="adverse_event_description",
        system_prompt=SYSTEM_PROMPT,
        model_alias=MODEL_ALIAS,
        prompt=(
            "{% if has_adverse_event == 1 %}"
            "[INSTRUCTIONS: Write a brief clinical description (1-2 sentences only) of the adverse event. "
            "Use formal medical language. Do not include meta-commentary or explain what you're doing.] "
            "{{adverse_event_type}}, {{adverse_event_severity}}. {{adverse_event_relatedness}} to study treatment.\n"
            "{% if adverse_event_resolved == 'Yes' %}Resolved.{% else %}Ongoing.{% endif %}\n"
            "{% else %}\n"
            "[INSTRUCTIONS: Output only the exact text 'No adverse events reported' without any additional commentary.] "
            "No adverse events reported.\n"
            "{% endif %}"
        ),
    )
)

# Protocol deviation description (if compliance is low)
config_builder.add_column(
    LLMTextColumnConfig(
        name="protocol_deviation",
        system_prompt=SYSTEM_PROMPT,
        model_alias=MODEL_ALIAS,
        prompt=(
            "{% if compliance_rate < 0.85 %}"
            "{% if documentation_style == 'Formal and Technical' %}"
            "[FORMAT INSTRUCTIONS: Write in a direct documentation style. Do not use phrases like 'it looks like' or "
            "'you've provided'. Begin with the protocol deviation details. Use formal terminology.]\n"
            "PROTOCOL DEVIATION REPORT\n"
            "Study ID: {{ study_id }}\n"
            "Participant: {{ participant_first_name }} {{ participant_last_name }} ({{ participant_id }})\n"
            "Compliance Rate: {{ compliance_rate }}\n"
            "[Continue with formal description of the deviation, impact on data integrity, and corrective actions. "
            "Reference coordinator {{ coordinator_first_name }} {{ coordinator_last_name }} and Dr. {{ investigator_last_name }}]\n"
            "{% elif documentation_style == 'Concise and Direct' %}\n"
            "[FORMAT INSTRUCTIONS: Use only brief notes and bullet points. No introductions or explanations.]\n"
            "PROTOCOL DEVIATION - {{ participant_id }}\n"
            "• Compliance: {{ compliance_rate }}\n"
            "• Impact: [severity level]\n"
            "• Actions: [list actions]\n"
            "• Coordinator: {{ coordinator_first_name }} {{ coordinator_last_name }}\n"
            "• PI: Dr. {{ investigator_last_name }}\n"
            "{% else %}\n"
            "[FORMAT INSTRUCTIONS: Write a narrative description. Begin directly with the deviation details. No meta-commentary.]\n"
            "During the {{ therapeutic_area }} study at {{ site_location }}, participant {{ participant_first_name }} "
            "{{ participant_last_name }} demonstrated a compliance rate of {{ compliance_rate }}, which constitutes a protocol deviation.\n"
            "[Continue with narrative about circumstances, discovery, impact, and team response. Include references to "
            "{{ coordinator_first_name }} {{ coordinator_last_name }} and Dr. {{ investigator_first_name }} {{ investigator_last_name }}]\n"
            "{% endif %}\n"
            "{% else %}\n"
            "[FORMAT INSTRUCTIONS: Write a simple direct statement. No meta-commentary or explanation.]\n"
            "PROTOCOL COMPLIANCE ASSESSMENT\n"
            "Participant: {{ participant_first_name }} {{ participant_last_name }} ({{ participant_id }})\n"
            "Finding: No protocol deviations. Compliance rate: {{ compliance_rate }}.\n"
            "{% endif %}"
        ),
    )
)
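
The `protocol_deviation` prompt above branches first on `compliance_rate < 0.85` and then on `documentation_style`. The sketch below mirrors that branching in plain Python and works out how often the deviation branch should fire, given that `compliance_rate` is sampled uniformly between 0.7 and 1.0. The function name and branch labels are hypothetical and exist only for illustration.

```python
DEVIATION_THRESHOLD = 0.85  # from the `compliance_rate < 0.85` test in the prompt


def deviation_branch(compliance_rate, documentation_style):
    """Return a label for the template branch that would be rendered."""
    if compliance_rate >= DEVIATION_THRESHOLD:
        return "compliance_statement"
    if documentation_style == "Formal and Technical":
        return "formal_report"
    if documentation_style == "Concise and Direct":
        return "bullet_notes"
    return "narrative"


# compliance_rate is uniform on [0.7, 1.0], so the share below the threshold is
# the length of [0.7, 0.85) divided by the length of the whole interval.
low, high = 0.7, 1.0
expected_deviation_share = (DEVIATION_THRESHOLD - low) / (high - low)
print(f"{expected_deviation_share:.2f}")  # 0.50
```

Roughly half of the generated records should therefore contain a deviation report; raising `low` in the uniform sampler or lowering the threshold in the prompt would reduce that share.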


### Adding Constraints

Finally, we'll add constraints to ensure our data is logically consistent:

- The trial end date must fall after the trial start date
- The enrollment date must fall on or after the trial start date and before the trial end date
- Baseline and final measurements must be positive
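
To make these conditions concrete, here is a small standard-library check that restates them for a single record. It only expresses the rules; how Data Designer enforces constraints internally is not covered here, and the helper name `satisfies_constraints` is hypothetical.

```python
from datetime import date


def satisfies_constraints(record):
    """Return True when a record meets the date-ordering and positivity rules."""
    return (
        record["trial_start_date"] < record["trial_end_date"]
        and record["trial_start_date"] <= record["enrollment_date"] < record["trial_end_date"]
        and record["baseline_measurement"] > 0
        and record["final_measurement"] > 0
    )


example = {
    "trial_start_date": date(2023, 6, 30),
    "trial_end_date": date(2023, 7, 15),
    "enrollment_date": date(2023, 8, 1),  # after the trial has ended
    "baseline_measurement": 98.2,
    "final_measurement": 84.0,
}
print(satisfies_constraints(example))  # False
```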


In [None]:
# Ensure appropriate date sequence
config_builder.add_constraint(
    target_column="trial_end_date",
    constraint_type="column_inequality",
    operator="gt",
    rhs="trial_start_date",
)

config_builder.add_constraint(
    target_column="enrollment_date",
    constraint_type="column_inequality",
    operator="ge",
    rhs="trial_start_date",
)

config_builder.add_constraint(
    target_column="enrollment_date",
    constraint_type="column_inequality",
    operator="lt",
    rhs="trial_end_date",
)

# Ensure reasonable clinical measurements
config_builder.add_constraint(
    target_column="baseline_measurement",
    constraint_type="scalar_inequality",
    operator="gt",
    rhs=0,
)

config_builder.add_constraint(
    target_column="final_measurement",
    constraint_type="scalar_inequality",
    operator="gt",
    rhs=0,
)


### 🔁 Iteration is key – preview the dataset!

1. Use the `preview` method to generate a sample of records quickly.

2. Inspect the results for quality and format issues.

3. Adjust column configurations, prompts, or parameters as needed.

4. Re-run the preview until satisfied.


In [None]:
# Preview a few records
preview = data_designer_client.preview(config_builder)

In [None]:
# Display a sample record from the preview.
preview.display_sample_record()

### 📊 Analyze the generated data

- Data Designer automatically generates a basic statistical analysis of the generated data.

- This analysis is available via the `analysis` property of generation result objects.


In [None]:
# Print the analysis as a table.
preview.analysis.to_report()

### 🆙 Scale up!

- Happy with your preview data?

- Use the `create` method to submit larger Data Designer generation jobs.


In [None]:
job_results = data_designer_client.create(config_builder, num_records=20)

# This will block until the job is complete.
job_results.wait_until_done()

In [None]:
# Load the generated dataset as a pandas DataFrame.
dataset = job_results.load_dataset()

dataset.head()

In [None]:
# Load the analysis results into memory.
analysis = job_results.load_analysis()

analysis.to_report()

In [None]:
TUTORIAL_OUTPUT_PATH = "data-designer-tutorial-output"

# Download the job artifacts and save them to disk.
job_results.download_artifacts(
    output_path=TUTORIAL_OUTPUT_PATH,
    artifacts_folder_name="artifacts-community-contributions-healthcare-datasets-clinical-trials",
);