# Ingest dummy data

In this notebook, we show how to create test data in the Bronze layer of the data lake, and how to update it with new rows.

In [0]:
from delta import DeltaTable
table_name = "Notes"

## Creation of the initial table

First, create a table and insert 10 rows into it. 

In this example, there are 4 columns in the table: NoteID, NoteText, UserID, AppointmentDate.
- NoteID is the primary key of this table and the column used to detect changes in the table.
- NoteText stores patient notes (in this example they are of course synthetic). Notes can be large. In the pipeline, the text in the notes will be pseudonymised, and additional features of interest will be extracted.
- UserID is an identifier of a patient that will need to be pseudonymised in the pipeline.
- AppointmentDate is the date of the patient's appointment and will also need to be pseudonymised so that the data cannot be linked back to the patient.
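
Before handing rows to `spark.createDataFrame`, it can help to check that each tuple matches the four-column layout described above. The following is a minimal sketch in plain Python. The helper name `validate_note_row` and the specific checks (integer IDs, ISO `YYYY-MM-DD` date strings) are assumptions made for illustration and are not part of the pipeline.

```python
from datetime import date

COLUMNS = ["NoteID", "NoteText", "UserID", "AppointmentDate"]


def validate_note_row(row: tuple) -> dict:
    """Check one (NoteID, NoteText, UserID, AppointmentDate) tuple.

    Hypothetical helper: returns the row as a dict keyed by column name,
    or raises ValueError if the row does not match the expected layout.
    """
    if len(row) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(row)}")
    note_id, note_text, user_id, appointment_date = row
    if not isinstance(note_id, int) or not isinstance(user_id, int):
        raise ValueError("NoteID and UserID must be integers")
    if not isinstance(note_text, str) or not note_text:
        raise ValueError("NoteText must be a non-empty string")
    # AppointmentDate is given as an ISO date string such as "2023-11-05";
    # date.fromisoformat raises ValueError for an invalid date.
    date.fromisoformat(appointment_date)
    return dict(zip(COLUMNS, row))
```

Calling `validate_note_row` on each tuple before building the DataFrame surfaces a malformed row early, rather than as a schema inference problem inside Spark.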

In [0]:
df = spark.createDataFrame(
    [
        (
            1,
            (
                "Jonathan appeared agitated during today's session, reporting heightened"
                " irritability and difficulty focusing at work. He shared concerns about"
                " persistent insomnia and a sense of impending doom. Jonathan is currently"
                " prescribed lorazepam (1mg as needed) for anxiety management, and we"
                " discussed incorporating relaxation techniques into his daily routine. A"
                " follow-up session is scheduled for November 19, 2023."
            ),
            483215,
            "2023-11-05",
        ),
        (
            2,
            (
                "Olivia conveyed a persistent low mood and feelings of guilt related to a"
                " recent personal loss. She described disruptions in her sleep pattern and"
                " appetite. Olivia is not currently taking any medications. We explored grief"
                " coping strategies and established a plan for ongoing support. Next session:"
                " November 22, 2023."
            ),
            176824,
            "2023-11-08",
        ),
        (
            3,
            (
                "Michael shared concerns about intrusive thoughts and compulsive behaviors"
                " indicative of obsessive-compulsive disorder. He is currently prescribed"
                " fluvoxamine (100mg daily). We discussed cognitive-behavioral strategies to"
                " manage obsessive thoughts. A follow-up is scheduled for November 24, 2023."
            ),
            742309,
            "2023-11-10",
        ),
        (
            4,
            (
                "Jasmine expressed feelings of overwhelming sadness and loss of interest in"
                " activities she once enjoyed. She is prescribed escitalopram (10mg daily) for"
                " depression. We discussed the importance of self-care and scheduled a"
                " follow-up for November 29, 2023."
            ),
            589124,
            "2023-11-15",
        ),
        (
            5,
            (
                "Lucas described acute anxiety related to social situations, impacting his"
                " daily life. He is currently taking sertraline (50mg daily). We explored"
                " exposure therapy techniques and set goals for gradual desensitization. The"
                " next session is scheduled for December 2, 2023."
            ),
            317468,
            "2023-11-18",
        ),
        (
            6,
            (
                "Zoe reported heightened stress levels due to academic pressures and"
                " challenges with time management. She is not currently on medication. We"
                " discussed stress reduction techniques and established strategies for"
                " improved work-life balance. Follow-up session: December 6, 2023."
            ),
            864502,
            "2023-11-22",
        ),
        (
            7,
            (
                "Ryan expressed symptoms of attention deficit hyperactivity disorder (ADHD),"
                " including difficulty sustaining attention and impulsivity. He is prescribed"
                " methylphenidate (20mg daily). We discussed behavioral strategies to manage"
                " ADHD symptoms. Next session: December 9, 2023."
            ),
            125739,
            "2023-11-25",
        ),
        (
            8,
            (
                "Ava shared concerns about recurrent panic attacks, particularly in crowded"
                " spaces. She is prescribed clonazepam (0.5mg as needed). We discussed"
                " breathing exercises and exposure therapy. Follow-up scheduled for December"
                " 13, 2023"
            ),
            650821,
            "2023-11-29",
        ),
        (
            9,
            (
                "Elijah reported persistent feelings of emptiness and identity disturbance. He"
                " is prescribed aripiprazole (5mg daily). We discussed the importance of mood"
                " tracking and established goals for emotional regulation. Next session:"
                " December 16, 2023."
            ),
            294617,
            "2023-12-02",
        ),
        (
            10,
            (
                "Sophia discussed challenges with impulse control and emotional dysregulation."
                " She is currently prescribed lamotrigine (50mg daily). We explored"
                " dialectical behavior therapy (DBT) skills to enhance emotion regulation."
                " Follow-up scheduled for December 19, 2023."
            ),
            817403,
            "2023-12-05",
        ),
    ],
    ["NoteID", "NoteText", "UserID", "AppointmentDate"],
)

In [0]:
df.write.format("delta").mode("overwrite").save(f"abfss://bronze@{spark.conf.get('spark.secret.datalake-uri')}/{table_name}")

Read back the table that has just been created and check that it looks as expected.

In [0]:
def read_table(table_name: str, layer: str):
    datalake_uri = spark.conf.get('spark.secret.datalake-uri')
    path = f"abfss://{layer}@{datalake_uri}/{table_name}"
    return spark.read.format("delta").load(path)

In [0]:
df = read_table(table_name, "bronze")
display(df)

NoteID,NoteText,UserID,AppointmentDate
1,"Jonathan appeared agitated during today's session, reporting heightened irritability and difficulty focusing at work. He shared concerns about persistent insomnia and a sense of impending doom. Jonathan is currently prescribed lorazepam (1mg as needed) for anxiety management, and we discussed incorporating relaxation techniques into his daily routine. A follow-up session is scheduled for November 19, 2023.",483215,2023-11-05
2,"Olivia conveyed a persistent low mood and feelings of guilt related to a recent personal loss. She described disruptions in her sleep pattern and appetite. Olivia is not currently taking any medications. We explored grief coping strategies and established a plan for ongoing support. Next session: November 22, 2023.",176824,2023-11-08
4,"Jasmine expressed feelings of overwhelming sadness and loss of interest in activities she once enjoyed. She is prescribed escitalopram (10mg daily) for depression. We discussed the importance of self-care and scheduled a follow-up for November 29, 2023.",589124,2023-11-15
5,"Lucas described acute anxiety related to social situations, impacting his daily life. He is currently taking sertraline (50mg daily). We explored exposure therapy techniques and set goals for gradual desensitization. The next session is scheduled for December 2, 2023.",317468,2023-11-18
9,"Elijah reported persistent feelings of emptiness and identity disturbance. He is prescribed aripiprazole (5mg daily). We discussed the importance of mood tracking and established goals for emotional regulation. Next session: December 16, 2023.",294617,2023-12-02
10,"Sophia discussed challenges with impulse control and emotional dysregulation. She is currently prescribed lamotrigine (50mg daily). We explored dialectical behavior therapy (DBT) skills to enhance emotion regulation. Follow-up scheduled for December 19, 2023.",817403,2023-12-05
3,"Michael shared concerns about intrusive thoughts and compulsive behaviors indicative of obsessive-compulsive disorder. He is currently prescribed fluvoxamine (100mg daily). We discussed cognitive-behavioral strategies to manage obsessive thoughts. A follow-up is scheduled for November 24, 2023.",742309,2023-11-10
6,"Zoe reported heightened stress levels due to academic pressures and challenges with time management. She is not currently on medication. We discussed stress reduction techniques and established strategies for improved work-life balance. Follow-up session: December 6, 2023.",864502,2023-11-22
7,"Ryan expressed symptoms of attention deficit hyperactivity disorder (ADHD), including difficulty sustaining attention and impulsivity. He is prescribed methylphenidate (20mg daily). We discussed behavioral strategies to manage ADHD symptoms. Next session: December 9, 2023.",125739,2023-11-25
8,"Ava shared concerns about recurrent panic attacks, particularly in crowded spaces. She is prescribed clonazepam (0.5mg as needed). We discussed breathing exercises and exposure therapy. Follow-up scheduled for December 13, 2023",650821,2023-11-29
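
The `abfss://` path is assembled in the same way in several cells of this notebook. A small helper can keep that in one place and reject a mistyped layer name. This is a sketch only: `table_path` and `KNOWN_LAYERS` are hypothetical names, and the example URI in the usage note below is a placeholder rather than a real storage account.

```python
KNOWN_LAYERS = ("bronze", "silver", "gold")


def table_path(layer: str, table_name: str, datalake_uri: str) -> str:
    """Build the abfss:// path of a table in the given data lake layer.

    Hypothetical helper; datalake_uri is expected to be the value of
    spark.conf.get('spark.secret.datalake-uri').
    """
    if layer not in KNOWN_LAYERS:
        raise ValueError(f"unknown layer {layer!r}, expected one of {KNOWN_LAYERS}")
    return f"abfss://{layer}@{datalake_uri}/{table_name}"
```

For example, `table_path("bronze", "Notes", "example.dfs.core.windows.net")` returns `abfss://bronze@example.dfs.core.windows.net/Notes`.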


## Checking the result of the pipeline

After running the previous cells, head to the ADF instance of your resource group (it will have a name like `adf-${flowehr_id}-dev`) and trigger the PatientsPipeline (click `Add Trigger`, then `Trigger Now`).

After it runs successfully, run the code below to check the results in the Silver and Gold layers of the data lake.

In [0]:
df = read_table(table_name, "silver")
display(df)

NoteID,NoteText,AppointmentDate
3,shared concerns about intrusive thoughts and compulsive behaviors indicative of obsessive-compulsive disorder. He is currently prescribed fluvoxamine (100mg ). We discussed cognitive-behavioral strategies to manage obsessive thoughts. A follow-up is scheduled for .,2023-11-10T00:00:00Z
5,"described acute anxiety related to social situations, impacting his life. He is currently taking sertraline (50mg ). We explored exposure therapy techniques and set goals for gradual desensitization. The next session is scheduled for .",2023-11-18T00:00:00Z
7,"expressed symptoms of attention deficit hyperactivity disorder (ADHD), including difficulty sustaining attention and impulsivity. He is prescribed methylphenidate (20mg ). We discussed behavioral strategies to manage ADHD symptoms. Next session: .",2023-11-25T00:00:00Z
6,reported heightened stress levels due to academic pressures and challenges with time management. She is not currently on medication. We discussed stress reduction techniques and established strategies for improved work-life balance. Follow-up session: .,2023-11-22T00:00:00Z
10,discussed challenges with impulse control and emotional dysregulation. She is currently prescribed lamotrigine (50mg ). We explored dialectical behavior therapy (DBT) skills to enhance emotion regulation. Follow-up scheduled for .,2023-12-05T00:00:00Z
1,"appeared agitated during 's session, reporting heightened irritability and difficulty focusing at work. He shared concerns about persistent insomnia and a sense of impending doom. is currently prescribed lorazepam (1mg as needed) for anxiety management, and we discussed incorporating relaxation techniques into his routine. A follow-up session is scheduled for .",2023-11-05T00:00:00Z
4,expressed feelings of overwhelming sadness and loss of interest in activities she once enjoyed. She is prescribed escitalopram (10mg ) for depression. We discussed the importance of self-care and scheduled a follow-up for .,2023-11-15T00:00:00Z
8,"shared concerns about recurrent panic attacks, particularly in crowded spaces. She is prescribed clonazepam (0.5mg as needed). We discussed breathing exercises and exposure therapy. Follow-up scheduled for ,",2023-11-29T00:00:00Z
2,conveyed a persistent low mood and feelings of guilt related to a recent personal loss. She described disruptions in her sleep pattern and appetite. is not currently taking any medications. We explored grief coping strategies and established a plan for ongoing support. Next session: .,2023-11-08T00:00:00Z
9,reported persistent feelings of emptiness and identity disturbance. He is prescribed aripiprazole (5mg ). We discussed the importance of mood tracking and established goals for emotional regulation. Next session: .,2023-12-02T00:00:00Z


In [0]:
df = read_table(table_name, "gold")
display(df)

NoteID,NoteText,AppointmentDate,AnalyzeHealthText_bcca9d806010_error,NoteText_extracted
10,discussed challenges with impulse control and emotional dysregulation. She is currently prescribed lamotrigine (50mg ). We explored dialectical behavior therapy (DBT) skills to enhance emotion regulation. Follow-up scheduled for .,2023-12-05T00:00:00Z,,"List(null, List(0, List(List(108, 11, lamotrigine, MedicationName, 1.0), List(121, 4, 50mg, Dosage, 1.0), List(152, 28, dialectical behavior therapy, TreatmentName, 0.97), List(182, 3, DBT, TreatmentName, 0.97), List(205, 18, emotion regulation, TreatmentName, 0.72), List(225, 9, Follow-up, AdministrativeEvent, 0.94)), List(List(DosageOfMedication, List(List(#/results/documents/0/entities/0, Medication), List(#/results/documents/0/entities/1, Dosage))), List(Abbreviation, List(List(#/results/documents/0/entities/2, FullTerm), List(#/results/documents/0/entities/3, AbbreviatedTerm)))), List(), null), null, 2023-04-01)"
1,"appeared agitated during 's session, reporting heightened irritability and difficulty focusing at work. He shared concerns about persistent insomnia and a sense of impending doom. is currently prescribed lorazepam (1mg as needed) for anxiety management, and we discussed incorporating relaxation techniques into his routine. A follow-up session is scheduled for .",2023-11-05T00:00:00Z,,"List(null, List(1, List(List(18, 8, agitated, SymptomOrSign, 0.97), List(67, 10, heightened, ConditionQualifier, 0.95), List(78, 12, irritability, SymptomOrSign, 0.98), List(95, 27, difficulty focusing at work, SymptomOrSign, 0.88), List(149, 19, persistent insomnia, Diagnosis, 1.0), List(233, 9, lorazepam, MedicationName, 1.0), List(244, 3, 1mg, Dosage, 1.0), List(248, 9, as needed, Frequency, 0.99), List(263, 18, anxiety management, TreatmentName, 0.85), List(314, 21, relaxation techniques, TreatmentName, 0.93), List(368, 17, follow-up session, AdministrativeEvent, 0.93)), List(List(QualifierOfCondition, List(List(#/results/documents/1/entities/1, Qualifier), List(#/results/documents/1/entities/2, Condition))), List(QualifierOfCondition, List(List(#/results/documents/1/entities/1, Qualifier), List(#/results/documents/1/entities/3, Condition))), List(DosageOfMedication, List(List(#/results/documents/1/entities/5, Medication), List(#/results/documents/1/entities/6, Dosage))), List(FrequencyOfMedication, List(List(#/results/documents/1/entities/5, Medication), List(#/results/documents/1/entities/7, Frequency)))), List(), null), null, 2023-04-01)"
4,expressed feelings of overwhelming sadness and loss of interest in activities she once enjoyed. She is prescribed escitalopram (10mg ) for depression. We discussed the importance of self-care and scheduled a follow-up for .,2023-11-15T00:00:00Z,,"List(null, List(2, List(List(31, 12, overwhelming, ConditionQualifier, 0.65), List(44, 7, sadness, SymptomOrSign, 0.86), List(56, 4, loss, SymptomOrSign, 0.72), List(123, 12, escitalopram, MedicationName, 1.0), List(137, 4, 10mg, Dosage, 1.0), List(159, 10, depression, Diagnosis, 0.97), List(228, 9, follow-up, AdministrativeEvent, 0.92)), List(List(QualifierOfCondition, List(List(#/results/documents/2/entities/0, Qualifier), List(#/results/documents/2/entities/1, Condition))), List(QualifierOfCondition, List(List(#/results/documents/2/entities/0, Qualifier), List(#/results/documents/2/entities/2, Condition))), List(DosageOfMedication, List(List(#/results/documents/2/entities/3, Medication), List(#/results/documents/2/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
8,"shared concerns about recurrent panic attacks, particularly in crowded spaces. She is prescribed clonazepam (0.5mg as needed). We discussed breathing exercises and exposure therapy. Follow-up scheduled for ,",2023-11-29T00:00:00Z,,"List(null, List(0, List(List(31, 9, recurrent, Course, 0.97), List(41, 13, panic attacks, Diagnosis, 1.0), List(106, 10, clonazepam, MedicationName, 1.0), List(118, 5, 0.5mg, Dosage, 1.0), List(124, 9, as needed, Frequency, 0.97), List(149, 19, breathing exercises, TreatmentName, 0.99), List(173, 16, exposure therapy, TreatmentName, 1.0), List(191, 9, Follow-up, AdministrativeEvent, 0.94)), List(List(CourseOfCondition, List(List(#/results/documents/0/entities/0, Course), List(#/results/documents/0/entities/1, Condition))), List(DosageOfMedication, List(List(#/results/documents/0/entities/2, Medication), List(#/results/documents/0/entities/3, Dosage))), List(FrequencyOfMedication, List(List(#/results/documents/0/entities/2, Medication), List(#/results/documents/0/entities/4, Frequency)))), List(), null), null, 2023-04-01)"
2,conveyed a persistent low mood and feelings of guilt related to a recent personal loss. She described disruptions in her sleep pattern and appetite. is not currently taking any medications. We explored grief coping strategies and established a plan for ongoing support. Next session: .,2023-11-08T00:00:00Z,,"List(null, List(1, List(List(20, 10, persistent, Course, 0.64), List(31, 8, low mood, SymptomOrSign, 0.98), List(44, 17, feelings of guilt, SymptomOrSign, 0.71), List(111, 32, disruptions in her sleep pattern, SymptomOrSign, 0.99), List(148, 8, appetite, SymptomOrSign, 1.0), List(195, 11, medications, TreatmentName, 0.97), List(220, 23, grief coping strategies, TreatmentName, 0.66)), List(List(CourseOfCondition, List(List(#/results/documents/1/entities/0, Course), List(#/results/documents/1/entities/1, Condition))), List(CourseOfCondition, List(List(#/results/documents/1/entities/0, Course), List(#/results/documents/1/entities/2, Condition)))), List(), null), null, 2023-04-01)"
9,reported persistent feelings of emptiness and identity disturbance. He is prescribed aripiprazole (5mg ). We discussed the importance of mood tracking and established goals for emotional regulation. Next session: .,2023-12-02T00:00:00Z,,"List(null, List(2, List(List(18, 10, persistent, Course, 0.84), List(41, 9, emptiness, SymptomOrSign, 0.9), List(55, 20, identity disturbance, SymptomOrSign, 0.9), List(94, 12, aripiprazole, MedicationName, 1.0), List(108, 3, 5mg, Dosage, 1.0)), List(List(CourseOfCondition, List(List(#/results/documents/2/entities/0, Course), List(#/results/documents/2/entities/1, Condition))), List(CourseOfCondition, List(List(#/results/documents/2/entities/0, Course), List(#/results/documents/2/entities/2, Condition))), List(DosageOfMedication, List(List(#/results/documents/2/entities/3, Medication), List(#/results/documents/2/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
7,"expressed symptoms of attention deficit hyperactivity disorder (ADHD), including difficulty sustaining attention and impulsivity. He is prescribed methylphenidate (20mg ). We discussed behavioral strategies to manage ADHD symptoms. Next session: .",2023-11-25T00:00:00Z,,"List(null, List(0, List(List(19, 8, symptoms, SymptomOrSign, 0.73), List(31, 40, attention deficit hyperactivity disorder, Diagnosis, 1.0), List(73, 4, ADHD, Diagnosis, 1.0), List(90, 31, difficulty sustaining attention, SymptomOrSign, 0.98), List(126, 11, impulsivity, SymptomOrSign, 0.98), List(156, 15, methylphenidate, MedicationName, 1.0), List(173, 4, 20mg, Dosage, 1.0), List(205, 21, behavioral strategies, TreatmentName, 0.68), List(237, 4, ADHD, Diagnosis, 1.0), List(242, 8, symptoms, SymptomOrSign, 0.73)), List(List(Abbreviation, List(List(#/results/documents/0/entities/1, FullTerm), List(#/results/documents/0/entities/2, AbbreviatedTerm))), List(Abbreviation, List(List(#/results/documents/0/entities/1, FullTerm), List(#/results/documents/0/entities/8, AbbreviatedTerm))), List(DosageOfMedication, List(List(#/results/documents/0/entities/5, Medication), List(#/results/documents/0/entities/6, Dosage)))), List(), null), null, 2023-04-01)"
6,reported heightened stress levels due to academic pressures and challenges with time management. She is not currently on medication. We discussed stress reduction techniques and established strategies for improved work-life balance. Follow-up session: .,2023-11-22T00:00:00Z,,"List(null, List(1, List(List(29, 6, stress, SymptomOrSign, 1.0), List(130, 10, medication, TreatmentName, 0.96), List(155, 27, stress reduction techniques, TreatmentName, 0.9), List(214, 8, improved, Course, 0.9), List(242, 17, Follow-up session, AdministrativeEvent, 0.88)), List(), List(), null), null, 2023-04-01)"
3,shared concerns about intrusive thoughts and compulsive behaviors indicative of obsessive-compulsive disorder. He is currently prescribed fluvoxamine (100mg ). We discussed cognitive-behavioral strategies to manage obsessive thoughts. A follow-up is scheduled for .,2023-11-10T00:00:00Z,,"List(null, List(0, List(List(31, 18, intrusive thoughts, SymptomOrSign, 0.89), List(54, 20, compulsive behaviors, SymptomOrSign, 0.82), List(89, 29, obsessive-compulsive disorder, Diagnosis, 1.0), List(147, 11, fluvoxamine, MedicationName, 1.0), List(160, 5, 100mg, Dosage, 1.0), List(193, 31, cognitive-behavioral strategies, TreatmentName, 0.86), List(235, 18, obsessive thoughts, Diagnosis, 0.62), List(257, 9, follow-up, AdministrativeEvent, 0.95)), List(List(DosageOfMedication, List(List(#/results/documents/0/entities/3, Medication), List(#/results/documents/0/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
5,"described acute anxiety related to social situations, impacting his life. He is currently taking sertraline (50mg ). We explored exposure therapy techniques and set goals for gradual desensitization. The next session is scheduled for .",2023-11-18T00:00:00Z,,"List(null, List(1, List(List(19, 13, acute anxiety, Diagnosis, 1.0), List(118, 10, sertraline, MedicationName, 1.0), List(130, 4, 50mg, Dosage, 1.0), List(161, 27, exposure therapy techniques, TreatmentName, 0.88), List(207, 7, gradual, Course, 0.63), List(215, 15, desensitization, TreatmentName, 0.99)), List(List(DosageOfMedication, List(List(#/results/documents/1/entities/1, Medication), List(#/results/documents/1/entities/2, Dosage))), List(CourseOfTreatment, List(List(#/results/documents/1/entities/4, Course), List(#/results/documents/1/entities/5, Treatment)))), List(), null), null, 2023-04-01)"
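
Each entry in `NoteText_extracted` appears to carry five values per entity: a character offset, a length, the matched text, a category, and a confidence score. The sketch below shows one way to filter such entities in plain Python. The field names on `Entity` and the helper `select_entities` are assumptions inferred from the displayed output, and the sample values are copied from the row for NoteID 10 above.

```python
from typing import List, NamedTuple


class Entity(NamedTuple):
    # Field names are assumed from the order of values in the output above.
    offset: int
    length: int
    text: str
    category: str
    confidence: float


# Entities copied from the NoteText_extracted value of NoteID 10.
entities = [
    Entity(108, 11, "lamotrigine", "MedicationName", 1.0),
    Entity(121, 4, "50mg", "Dosage", 1.0),
    Entity(152, 28, "dialectical behavior therapy", "TreatmentName", 0.97),
    Entity(182, 3, "DBT", "TreatmentName", 0.97),
    Entity(205, 18, "emotion regulation", "TreatmentName", 0.72),
    Entity(225, 9, "Follow-up", "AdministrativeEvent", 0.94),
]


def select_entities(
    entities: List[Entity], category: str, min_confidence: float = 0.0
) -> List[str]:
    """Return the text of entities in a category at or above a confidence threshold."""
    return [
        e.text
        for e in entities
        if e.category == category and e.confidence >= min_confidence
    ]


treatments = select_entities(entities, "TreatmentName", min_confidence=0.9)
# treatments == ["dialectical behavior therapy", "DBT"]
```

A confidence threshold like this is a design choice for downstream analysis; the value 0.9 here is arbitrary and only serves the example.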


You can also check the table history, as shown below.

In [0]:
layer = "silver"  # Replace with "gold" or "bronze" to check other layers

path = f"abfss://{layer}@{spark.conf.get('spark.secret.datalake-uri')}/{table_name}"
display(DeltaTable.forPath(spark, path).history())

version,timestamp,userId,userName,operation,operationParameters,job,notebook,clusterId,readVersion,isolationLevel,isBlindAppend,operationMetrics,userMetadata,engineInfo
2,2023-11-02T16:05:11Z,3929536096574620,79302ad7-a448-40dd-a0fd-ed776012af4b,MERGE,"Map(predicate -> [""(NoteID#748L = NoteID#320L)""], matchedPredicates -> [], statsOnLoad -> false, notMatchedBySourcePredicates -> [], notMatchedPredicates -> [{""actionType"":""insert""}])","List(481463139130891, ADF_adf-tbdc0-dev_PatientNotesPipeline_PseudonymisationActivity_b3bc7b5c-2f48-4606-8b58-c6b90409be0f, null, 992140386478951, 3929536096574620, manual)",,1025-001910-dtqy2seg,1.0,WriteSerializable,False,"Map(numTargetRowsCopied -> 0, numTargetRowsDeleted -> 0, numTargetFilesAdded -> 1, numTargetBytesAdded -> 2541, numTargetBytesRemoved -> 0, numTargetDeletionVectorsAdded -> 0, numTargetRowsMatchedUpdated -> 0, executionTimeMs -> 38082, numTargetRowsInserted -> 1, numTargetRowsMatchedDeleted -> 0, scanTimeMs -> 0, numTargetRowsUpdated -> 0, numOutputRows -> 1, numTargetDeletionVectorsRemoved -> 0, numTargetRowsNotMatchedBySourceUpdated -> 0, numTargetChangeFilesAdded -> 0, numSourceRows -> 1, numTargetFilesRemoved -> 0, numTargetRowsNotMatchedBySourceDeleted -> 0, rewriteTimeMs -> 38058)",,Databricks-Runtime/13.3.x-scala2.12
1,2023-11-02T16:04:29Z,3929536096574620,79302ad7-a448-40dd-a0fd-ed776012af4b,MERGE,"Map(predicate -> [""(NoteID#748L = NoteID#320L)""], matchedPredicates -> [{""actionType"":""delete""}], statsOnLoad -> false, notMatchedBySourcePredicates -> [], notMatchedPredicates -> [])","List(481463139130891, ADF_adf-tbdc0-dev_PatientNotesPipeline_PseudonymisationActivity_b3bc7b5c-2f48-4606-8b58-c6b90409be0f, null, 992140386478951, 3929536096574620, manual)",,1025-001910-dtqy2seg,0.0,WriteSerializable,False,"Map(numTargetRowsCopied -> 0, numTargetRowsDeleted -> 0, numTargetFilesAdded -> 0, numTargetBytesAdded -> 0, numTargetBytesRemoved -> 0, numTargetDeletionVectorsAdded -> 0, numTargetRowsMatchedUpdated -> 0, executionTimeMs -> 10668, numTargetRowsInserted -> 0, numTargetRowsMatchedDeleted -> 0, scanTimeMs -> 3142, numTargetRowsUpdated -> 0, numOutputRows -> 0, numTargetDeletionVectorsRemoved -> 0, numTargetRowsNotMatchedBySourceUpdated -> 0, numTargetChangeFilesAdded -> 0, numSourceRows -> 0, numTargetFilesRemoved -> 0, numTargetRowsNotMatchedBySourceDeleted -> 0, rewriteTimeMs -> 0)",,Databricks-Runtime/13.3.x-scala2.12
0,2023-10-25T17:35:52Z,3929536096574620,79302ad7-a448-40dd-a0fd-ed776012af4b,WRITE,"Map(mode -> ErrorIfExists, statsOnLoad -> false, partitionBy -> [])","List(792125171179599, ADF_adf-tbdc0-dev_PatientNotesPipeline_PseudonymisationActivity_6c131e32-586b-4c79-aad9-c992d88e2b52, null, 247033079801813, 3929536096574620, manual)",,1025-001910-dtqy2seg,,WriteSerializable,True,"Map(numFiles -> 8, numOutputRows -> 10, numOutputBytes -> 20437)",,Databricks-Runtime/13.3.x-scala2.12


## Inserting an update into the table

Now let's insert a new row into the table, and delete an existing one.

Note that in this pipeline, updates to existing rows are not processed. (If you try to update a row, the pipeline will fail.)

This is because, for large volumes of text, it is expensive to determine which rows have changed when the updated rows have the same primary keys as rows that already exist in the table.

Thus, we assume that each table update either inserts a new primary key or removes an existing one.
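
To illustrate why this assumption keeps change detection cheap, the sketch below compares only primary keys and never looks at the note text. It is plain Python rather than the pipeline's actual implementation, which is not shown in this notebook; `diff_primary_keys` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def diff_primary_keys(
    existing_ids: Iterable[int], incoming_ids: Iterable[int]
) -> Dict[str, List[int]]:
    """Classify changes between two snapshots of a table using keys only.

    existing_ids: NoteIDs already processed.
    incoming_ids: NoteIDs present in the latest version of the source table.
    """
    existing, incoming = set(existing_ids), set(incoming_ids)
    return {
        "inserted": sorted(incoming - existing),  # keys seen for the first time
        "deleted": sorted(existing - incoming),   # keys that have disappeared
    }


# NoteIDs 1 to 10 were processed; the new snapshot drops 3 and adds 11.
changes = diff_primary_keys(range(1, 11), [1, 2, 4, 5, 6, 7, 8, 9, 10, 11])
# changes == {"inserted": [11], "deleted": [3]}
```

A row whose text changed while its key stayed the same would not appear in either list, which is exactly the case the pipeline does not support.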

In [0]:


update_df = spark.createDataFrame(
    [
        (
            11,
            (
                "Mia described symptoms of insomnia and racing thoughts, suggesting generalized anxiety. She is currently not taking any medication. We explored sleep hygiene practices and relaxation techniques. A follow-up appointment is scheduled for December 29, 2023."
            ),
            548290,
            "2023-12-15"
        )
    ],
    ["NoteID", "NoteText", "UserID", "AppointmentDate"]
)

display(update_df)

NoteID,NoteText,UserID,AppointmentDate
11,"Mia described symptoms of insomnia and racing thoughts, suggesting generalized anxiety. She is currently not taking any medication. We explored sleep hygiene practices and relaxation techniques. A follow-up appointment is scheduled for December 29, 2023.",548290,2023-12-15


In [0]:
path = f"abfss://bronze@{spark.conf.get('spark.secret.datalake-uri')}/{table_name}"
delta_table = DeltaTable.forPath(spark, path)
delta_table.alias("target").merge(
    source=update_df.alias("source"),
    condition="source.NoteID = target.NoteID"
).whenNotMatchedInsertAll().execute()
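
The cell above only inserts a row. Deleting an existing row from the Bronze table could look like the sketch below, which builds a SQL predicate and passes it to `DeltaTable.delete`. The helper names `build_delete_condition` and `delete_notes` are hypothetical, and the choice of which NoteID to delete is left to the reader.

```python
from typing import Iterable


def build_delete_condition(note_ids: Iterable[int]) -> str:
    """Build a SQL predicate matching the given NoteIDs."""
    # Casting to int guards against injecting arbitrary text into the predicate.
    ids = sorted({int(i) for i in note_ids})
    if not ids:
        raise ValueError("at least one NoteID is required")
    return "NoteID IN ({})".format(", ".join(str(i) for i in ids))


def delete_notes(delta_table, note_ids: Iterable[int]) -> None:
    """Delete the rows with the given NoteIDs from a Delta table.

    delta_table is expected to be a DeltaTable, such as the one returned by
    DeltaTable.forPath(spark, path); its delete method accepts a SQL predicate string.
    """
    delta_table.delete(build_delete_condition(note_ids))
```

With the `delta_table` from the previous cell, `delete_notes(delta_table, [3])` would remove the note with NoteID 3.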

Now you can re-trigger the pipeline.
After you've done that, you can check in the logs how many rows were processed in the Silver and Gold stages of the pipeline.

## Querying data in the Gold store

The final result of the pipeline can be accessed through a managed table in Unity Catalog.

You can query the data with a SQL statement in a notebook, like so:

In [0]:
%sql
SELECT * FROM catalog.schema.Notes ORDER BY NoteID ASC;

NoteID,NoteText,AppointmentDate,AnalyzeHealthText_bcca9d806010_error,NoteText_extracted
1,"appeared agitated during 's session, reporting heightened irritability and difficulty focusing at work. He shared concerns about persistent insomnia and a sense of impending doom. is currently prescribed lorazepam (1mg as needed) for anxiety management, and we discussed incorporating relaxation techniques into his routine. A follow-up session is scheduled for .",2023-11-05T00:00:00Z,,"List(null, List(1, List(List(18, 8, agitated, SymptomOrSign, 0.97), List(67, 10, heightened, ConditionQualifier, 0.95), List(78, 12, irritability, SymptomOrSign, 0.98), List(95, 27, difficulty focusing at work, SymptomOrSign, 0.88), List(149, 19, persistent insomnia, Diagnosis, 1.0), List(233, 9, lorazepam, MedicationName, 1.0), List(244, 3, 1mg, Dosage, 1.0), List(248, 9, as needed, Frequency, 0.99), List(263, 18, anxiety management, TreatmentName, 0.85), List(314, 21, relaxation techniques, TreatmentName, 0.93), List(368, 17, follow-up session, AdministrativeEvent, 0.93)), List(List(QualifierOfCondition, List(List(#/results/documents/1/entities/1, Qualifier), List(#/results/documents/1/entities/2, Condition))), List(QualifierOfCondition, List(List(#/results/documents/1/entities/1, Qualifier), List(#/results/documents/1/entities/3, Condition))), List(DosageOfMedication, List(List(#/results/documents/1/entities/5, Medication), List(#/results/documents/1/entities/6, Dosage))), List(FrequencyOfMedication, List(List(#/results/documents/1/entities/5, Medication), List(#/results/documents/1/entities/7, Frequency)))), List(), null), null, 2023-04-01)"
2,conveyed a persistent low mood and feelings of guilt related to a recent personal loss. She described disruptions in her sleep pattern and appetite. is not currently taking any medications. We explored grief coping strategies and established a plan for ongoing support. Next session: .,2023-11-08T00:00:00Z,,"List(null, List(1, List(List(20, 10, persistent, Course, 0.64), List(31, 8, low mood, SymptomOrSign, 0.98), List(44, 17, feelings of guilt, SymptomOrSign, 0.71), List(111, 32, disruptions in her sleep pattern, SymptomOrSign, 0.99), List(148, 8, appetite, SymptomOrSign, 1.0), List(195, 11, medications, TreatmentName, 0.97), List(220, 23, grief coping strategies, TreatmentName, 0.66)), List(List(CourseOfCondition, List(List(#/results/documents/1/entities/0, Course), List(#/results/documents/1/entities/1, Condition))), List(CourseOfCondition, List(List(#/results/documents/1/entities/0, Course), List(#/results/documents/1/entities/2, Condition)))), List(), null), null, 2023-04-01)"
3,shared concerns about intrusive thoughts and compulsive behaviors indicative of obsessive-compulsive disorder. He is currently prescribed fluvoxamine (100mg ). We discussed cognitive-behavioral strategies to manage obsessive thoughts. A follow-up is scheduled for .,2023-11-10T00:00:00Z,,"List(null, List(0, List(List(31, 18, intrusive thoughts, SymptomOrSign, 0.89), List(54, 20, compulsive behaviors, SymptomOrSign, 0.82), List(89, 29, obsessive-compulsive disorder, Diagnosis, 1.0), List(147, 11, fluvoxamine, MedicationName, 1.0), List(160, 5, 100mg, Dosage, 1.0), List(193, 31, cognitive-behavioral strategies, TreatmentName, 0.86), List(235, 18, obsessive thoughts, Diagnosis, 0.62), List(257, 9, follow-up, AdministrativeEvent, 0.95)), List(List(DosageOfMedication, List(List(#/results/documents/0/entities/3, Medication), List(#/results/documents/0/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
4,expressed feelings of overwhelming sadness and loss of interest in activities she once enjoyed. She is prescribed escitalopram (10mg ) for depression. We discussed the importance of self-care and scheduled a follow-up for .,2023-11-15T00:00:00Z,,"List(null, List(2, List(List(31, 12, overwhelming, ConditionQualifier, 0.65), List(44, 7, sadness, SymptomOrSign, 0.86), List(56, 4, loss, SymptomOrSign, 0.72), List(123, 12, escitalopram, MedicationName, 1.0), List(137, 4, 10mg, Dosage, 1.0), List(159, 10, depression, Diagnosis, 0.97), List(228, 9, follow-up, AdministrativeEvent, 0.92)), List(List(QualifierOfCondition, List(List(#/results/documents/2/entities/0, Qualifier), List(#/results/documents/2/entities/1, Condition))), List(QualifierOfCondition, List(List(#/results/documents/2/entities/0, Qualifier), List(#/results/documents/2/entities/2, Condition))), List(DosageOfMedication, List(List(#/results/documents/2/entities/3, Medication), List(#/results/documents/2/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
5,"described acute anxiety related to social situations, impacting his life. He is currently taking sertraline (50mg ). We explored exposure therapy techniques and set goals for gradual desensitization. The next session is scheduled for .",2023-11-18T00:00:00Z,,"List(null, List(1, List(List(19, 13, acute anxiety, Diagnosis, 1.0), List(118, 10, sertraline, MedicationName, 1.0), List(130, 4, 50mg, Dosage, 1.0), List(161, 27, exposure therapy techniques, TreatmentName, 0.88), List(207, 7, gradual, Course, 0.63), List(215, 15, desensitization, TreatmentName, 0.99)), List(List(DosageOfMedication, List(List(#/results/documents/1/entities/1, Medication), List(#/results/documents/1/entities/2, Dosage))), List(CourseOfTreatment, List(List(#/results/documents/1/entities/4, Course), List(#/results/documents/1/entities/5, Treatment)))), List(), null), null, 2023-04-01)"
6,reported heightened stress levels due to academic pressures and challenges with time management. She is not currently on medication. We discussed stress reduction techniques and established strategies for improved work-life balance. Follow-up session: .,2023-11-22T00:00:00Z,,"List(null, List(1, List(List(29, 6, stress, SymptomOrSign, 1.0), List(130, 10, medication, TreatmentName, 0.96), List(155, 27, stress reduction techniques, TreatmentName, 0.9), List(214, 8, improved, Course, 0.9), List(242, 17, Follow-up session, AdministrativeEvent, 0.88)), List(), List(), null), null, 2023-04-01)"
7,"expressed symptoms of attention deficit hyperactivity disorder (ADHD), including difficulty sustaining attention and impulsivity. He is prescribed methylphenidate (20mg ). We discussed behavioral strategies to manage ADHD symptoms. Next session: .",2023-11-25T00:00:00Z,,"List(null, List(0, List(List(19, 8, symptoms, SymptomOrSign, 0.73), List(31, 40, attention deficit hyperactivity disorder, Diagnosis, 1.0), List(73, 4, ADHD, Diagnosis, 1.0), List(90, 31, difficulty sustaining attention, SymptomOrSign, 0.98), List(126, 11, impulsivity, SymptomOrSign, 0.98), List(156, 15, methylphenidate, MedicationName, 1.0), List(173, 4, 20mg, Dosage, 1.0), List(205, 21, behavioral strategies, TreatmentName, 0.68), List(237, 4, ADHD, Diagnosis, 1.0), List(242, 8, symptoms, SymptomOrSign, 0.73)), List(List(Abbreviation, List(List(#/results/documents/0/entities/1, FullTerm), List(#/results/documents/0/entities/2, AbbreviatedTerm))), List(Abbreviation, List(List(#/results/documents/0/entities/1, FullTerm), List(#/results/documents/0/entities/8, AbbreviatedTerm))), List(DosageOfMedication, List(List(#/results/documents/0/entities/5, Medication), List(#/results/documents/0/entities/6, Dosage)))), List(), null), null, 2023-04-01)"
8,"shared concerns about recurrent panic attacks, particularly in crowded spaces. She is prescribed clonazepam (0.5mg as needed). We discussed breathing exercises and exposure therapy. Follow-up scheduled for ,",2023-11-29T00:00:00Z,,"List(null, List(0, List(List(31, 9, recurrent, Course, 0.97), List(41, 13, panic attacks, Diagnosis, 1.0), List(106, 10, clonazepam, MedicationName, 1.0), List(118, 5, 0.5mg, Dosage, 1.0), List(124, 9, as needed, Frequency, 0.97), List(149, 19, breathing exercises, TreatmentName, 0.99), List(173, 16, exposure therapy, TreatmentName, 1.0), List(191, 9, Follow-up, AdministrativeEvent, 0.94)), List(List(CourseOfCondition, List(List(#/results/documents/0/entities/0, Course), List(#/results/documents/0/entities/1, Condition))), List(DosageOfMedication, List(List(#/results/documents/0/entities/2, Medication), List(#/results/documents/0/entities/3, Dosage))), List(FrequencyOfMedication, List(List(#/results/documents/0/entities/2, Medication), List(#/results/documents/0/entities/4, Frequency)))), List(), null), null, 2023-04-01)"
9,reported persistent feelings of emptiness and identity disturbance. He is prescribed aripiprazole (5mg ). We discussed the importance of mood tracking and established goals for emotional regulation. Next session: .,2023-12-02T00:00:00Z,,"List(null, List(2, List(List(18, 10, persistent, Course, 0.84), List(41, 9, emptiness, SymptomOrSign, 0.9), List(55, 20, identity disturbance, SymptomOrSign, 0.9), List(94, 12, aripiprazole, MedicationName, 1.0), List(108, 3, 5mg, Dosage, 1.0)), List(List(CourseOfCondition, List(List(#/results/documents/2/entities/0, Course), List(#/results/documents/2/entities/1, Condition))), List(CourseOfCondition, List(List(#/results/documents/2/entities/0, Course), List(#/results/documents/2/entities/2, Condition))), List(DosageOfMedication, List(List(#/results/documents/2/entities/3, Medication), List(#/results/documents/2/entities/4, Dosage)))), List(), null), null, 2023-04-01)"
10,discussed challenges with impulse control and emotional dysregulation. She is currently prescribed lamotrigine (50mg ). We explored dialectical behavior therapy (DBT) skills to enhance emotion regulation. Follow-up scheduled for .,2023-12-05T00:00:00Z,,"List(null, List(0, List(List(108, 11, lamotrigine, MedicationName, 1.0), List(121, 4, 50mg, Dosage, 1.0), List(152, 28, dialectical behavior therapy, TreatmentName, 0.97), List(182, 3, DBT, TreatmentName, 0.97), List(205, 18, emotion regulation, TreatmentName, 0.72), List(225, 9, Follow-up, AdministrativeEvent, 0.94)), List(List(DosageOfMedication, List(List(#/results/documents/0/entities/0, Medication), List(#/results/documents/0/entities/1, Dosage))), List(Abbreviation, List(List(#/results/documents/0/entities/2, FullTerm), List(#/results/documents/0/entities/3, AbbreviatedTerm)))), List(), null), null, 2023-04-01)"
