Function to get periodPriorToIndex #1

Open
Jay-sanjay opened this issue Dec 30, 2023 · 4 comments

Comments

@Jay-sanjay
Member

Shri !!

The first decision that needs to be made is the moment in time from which selected treatments of interest should be included in
the treatment pathway. periodPriorToIndex specifies the period (i.e. number of days) prior to the index date from which
treatments should be included.
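
As a small illustration of this definition, the sketch below computes the start of the inclusion window for one patient and checks whether a treatment date falls inside it. The helper names `window_start` and `is_included` are hypothetical, and counting the boundary day as included is an assumption.

```julia
using Dates

# Hypothetical helper: the first day from which treatments are included.
window_start(index_date::Date, period_prior_to_index::Day) =
    index_date - period_prior_to_index

# Hypothetical helper: a treatment is included when it falls on or after the
# window start (including the boundary day itself is an assumption).
is_included(treatment_date::Date, index_date::Date, period_prior_to_index::Day) =
    treatment_date >= window_start(index_date, period_prior_to_index)

index_date = Date(2023, 6, 1)
window_start(index_date, Day(100))                    # Date("2023-02-21")
is_included(Date(2023, 3, 15), index_date, Day(100))  # true
is_included(Date(2022, 12, 1), index_date, Day(100))  # false
```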

@Jay-sanjay Jay-sanjay self-assigned this Dec 30, 2023
@Jay-sanjay Jay-sanjay added the discussion Needs to be dicussed-designed throughly before implementing label Dec 30, 2023
@TheCedarPrince
Member

Just copying our discussion from Slack to here so we don't lose the ideas:

My plan is that this function will perform a very basic operation: it creates a new data frame that has the patients after periodPriorToIndex, and this data frame can then be passed to the next functions for further processing in the pathway extraction.
Do let me know what your thoughts are on it 😄
~ @Jay-sanjay

I think this approach sounds great!

For the first version of this function, we'll want to be able to pass it a connection object, a list of cohorts to examine (like in OCC), and a time period (for example, 100 days). A potential API might look like this:

function period_prior_to_index(cohort_ids::Vector, conn; date_prior::Period = Day(100))

#= 

Code that does neat stuff

=#

#= 

Code that will do something approximately like this:

1. Read the selected patient cohorts from the cohort table
2. Find each patient's entry date
3. Apply the period prior to each patient's entry date

=#
end
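
The three steps above can be sketched on in-memory data, without a database connection. This is only an illustration under stated assumptions: `cohort_rows` is made-up data using the OMOP cohort table column names, and `cohort_window_starts` is a hypothetical helper, not the proposed function itself.

```julia
using Dates

# Made-up rows standing in for the cohort table.
cohort_rows = [
    (cohort_definition_id = 1, subject_id = 101, cohort_start_date = Date(2020, 5, 1)),
    (cohort_definition_id = 1, subject_id = 102, cohort_start_date = Date(2021, 1, 10)),
    (cohort_definition_id = 2, subject_id = 103, cohort_start_date = Date(2019, 3, 1)),
]

# Hypothetical in-memory version of the three steps.
function cohort_window_starts(rows, cohort_ids; date_prior::Period = Day(100))
    # 1. Keep only the rows that belong to the selected cohorts
    selected = filter(r -> r.cohort_definition_id in cohort_ids, rows)
    # 2. and 3. Take each patient's entry date and move it back by date_prior
    return [(subject_id = r.subject_id,
             cohort_definition_id = r.cohort_definition_id,
             window_start = r.cohort_start_date - date_prior) for r in selected]
end

starts = cohort_window_starts(cohort_rows, [1])
```

Here `starts` holds one entry per patient in cohort 1, each with the date from which that patient's treatments would be considered.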

Two more interesting dispatches would cover the case where we do not have defined cohorts. These would be much trickier to implement, but I think they would be very useful and interesting. Here is how these dispatches would work:

The first dispatch would only care about the following:

function period_prior_to_index(person_ids::Vector, conn; index_date::Symbol, date_prior::Period = Day(100))

#= 

Code that reads some defaults for index_date.
These defaults could be things like:

:last_observation -- the index date of each patient is their last recorded observation 
:last_visit -- the index date of each patient is their last recorded visit
:last_refill -- the index date of each patient is their last recorded refill

And then the code would need to handle these defaults

=#

#= 

Code that will do something approximately like this:

1. Queries desired patients from person table
2. Calculates index_date default per patient
3. Applies date_prior to each patient index date

=#
end
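
A minimal sketch of how the `index_date` defaults could be resolved, again on in-memory data. Everything here is an assumption for illustration: `records` is made-up data standing in for query results, and `default_index_dates` and `default_window_starts` are hypothetical helpers.

```julia
using Dates

# Made-up (person_id, date) pairs per default, standing in for query results.
records = Dict(
    :last_observation => [(1, Date(2022, 1, 5)), (1, Date(2022, 3, 9)), (2, Date(2021, 7, 1))],
    :last_visit       => [(1, Date(2022, 2, 1)), (2, Date(2021, 8, 15))],
    :last_refill      => [(2, Date(2021, 9, 30))],
)

# Hypothetical helper: latest recorded date per requested person.
function default_index_dates(person_ids, index_date::Symbol)
    haskey(records, index_date) || throw(ArgumentError("Invalid index_date: $index_date"))
    index_dates = Dict{Int, Date}()
    for (person_id, date) in records[index_date]
        person_id in person_ids || continue
        # Keep the most recent date seen so far for this person
        index_dates[person_id] = max(get(index_dates, person_id, date), date)
    end
    return index_dates
end

# Hypothetical helper: apply date_prior to each person's index date.
function default_window_starts(person_ids, index_date::Symbol; date_prior::Period = Day(100))
    return Dict(id => d - date_prior for (id, d) in default_index_dates(person_ids, index_date))
end

default_index_dates([1, 2], :last_observation)
```

Patients with no record for the chosen default (person 1 under `:last_refill` in this data) are simply absent from the result; whether they should instead be reported as missing is a design choice left open here.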

The final dispatch I have in mind is somehow both easier and harder than the previous one. Here, index_date accepts a user-defined function that calculates the index date; period_prior_to_index passes the vector of person ids and the connection object to this function, which should return a dataframe of index dates per person_id. Here's how it looks:

function period_prior_to_index(person_ids::Vector, conn; index_date::Function, date_prior::Period = Day(100))

#= 

Apply the index_date function like this:

patient_date_indices = index_date(person_ids, conn)

=#

#= 

Code that will do something approximately like this:

1. Queries desired patients from person table
2. Calculates the index date per patient using the supplied index_date function
3. Applies date_prior to each patient index date

=#
end
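
One thing to keep in mind with these signatures: Julia selects methods by their positional arguments only, so methods that differ only in the type of the `index_date` keyword would overwrite one another. The sketch below assumes `index_date` is moved to a positional argument (an assumption, not part of the API above) so that the Symbol and Function versions can coexist. The name `prior_window`, the `last_visits` data, and the use of a Dict instead of a dataframe are all hypothetical simplifications.

```julia
using Dates

# Made-up data standing in for the database.
last_visits = Dict(1 => Date(2022, 2, 1), 2 => Date(2021, 8, 15))

# Selected when index_date is a Symbol naming a built-in default.
function prior_window(person_ids::Vector, conn, index_date::Symbol; date_prior::Period = Day(100))
    index_date == :last_visit || throw(ArgumentError("Invalid index_date: $index_date"))
    return Dict(id => last_visits[id] - date_prior for id in person_ids)
end

# Selected when index_date is a user-defined function.
function prior_window(person_ids::Vector, conn, index_date::Function; date_prior::Period = Day(100))
    patient_date_indices = index_date(person_ids, conn)  # person_id => index date
    return Dict(id => d - date_prior for (id, d) in patient_date_indices)
end

# A user-supplied index date function; `conn` is unused in this toy example.
fixed_index(person_ids, conn) = Dict(id => Date(2023, 1, 1) for id in person_ids)

by_default = prior_window([1, 2], nothing, :last_visit)
by_function = prior_window([1, 2], nothing, fixed_index; date_prior = Day(30))
```

Both calls share one function name, and Julia picks the method from the type of the third positional argument.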

Finally, it will be interesting to see how well all the dispatches parallelize. We will want to test that out at some point too. Hope this helps @Jay-sanjay!

@Jay-sanjay
Member Author

Hi @TheCedarPrince, here is how I have approached the first two dispatches using FunSQL:

using Dates, DataFrames, DBInterface
using FunSQL: From, Where, Select, Get, Fun, render

# `tab` is expected to be a FunSQL SQLTable describing the cohort table
# (drug_exposure has no cohort_definition_id column).
function period_prior_to_index(cohort_ids::Vector, conn; date_prior::Period = Day(100), tab, dialect = :sqlite)
    cohort_entry_dates = Dict()

    for cohort_id in cohort_ids
        # Construct the SQL query
        sql = From(tab) |>
              Where(Fun.in(Get.cohort_definition_id, cohort_id)) |>
              Select(Get.cohort_definition_id, Get.cohort_start_date) |>
              q -> render(q, dialect = dialect)

        # Execute the SQL query and materialize the rows
        result = DataFrame(DBInterface.execute(conn, String(sql)))

        # Check that the query returned at least one row
        if !isempty(result)
            # Convert the first cohort_start_date to DateTime and subtract date_prior
            adjusted_date = DateTime(result.cohort_start_date[1]) - date_prior

            # Store the cohort_id and the adjusted_date in the dictionary
            cohort_entry_dates[cohort_id] = adjusted_date
        end
    end

    return cohort_entry_dates
end

and

using Dates, DataFrames, DBInterface
using FunSQL: SQLTable, From, Where, Group, Select, Get, Agg, render

function period_prior_to_index(person_ids::Vector, conn; index_date::Symbol, date_prior::Period = Dates.Day(100))

    # Table and date column for each supported default. The names follow the
    # OMOP CDM; treating a refill as a drug exposure is an assumption.
    sources = Dict(
        :last_observation => (:observation, :observation_date),
        :last_visit => (:visit_occurrence, :visit_start_date),
        :last_refill => (:drug_exposure, :drug_exposure_start_date),
    )
    haskey(sources, index_date) || error("Invalid index_date: $index_date")
    tab, date_col = sources[index_date]

    # A dictionary to store the person_id and the adjusted index date
    person_index_dates = Dict()

    for person_id in person_ids
        # Latest date recorded for this person in the chosen table
        sql = From(SQLTable(tab, columns = [:person_id, date_col])) |>
              Where(Get.person_id .== person_id) |>
              Group() |>
              Select(:index_date => Agg.max(Get(date_col)))

        # Rendering the SQL query
        sql_string = render(sql, dialect = :sqlite)

        # Executing the SQL query
        result = DataFrame(DBInterface.execute(conn, String(sql_string)))

        # Skip patients with no records (MAX over zero rows is NULL)
        if !isempty(result) && !ismissing(result.index_date[1])
            # Applying date_prior to each patient index date
            person_index_dates[person_id] = DateTime(result.index_date[1]) - date_prior
        end
    end

    return person_index_dates
end

@Jay-sanjay
Member Author

Also, here is the first dispatch, which I have tried using TidierDB:

using Dates, TidierDB

function period_prior_to_index(cohort_ids::Vector, conn; date_prior::Period = Dates.Day(100), tab="COHORT")

    cohort_entry_dates = Dict()

    for cohort_id in cohort_ids
        # Use TidierDB to get the cohort_start_date
        result = @chain db_table(conn, tab) begin
            @filter(cohort_definition_id == cohort_id)
            @select(:cohort_definition_id, :cohort_start_date)
            @collect
        end

        # Check that the query returned at least one row
        if !isempty(result)
            # Convert the first cohort_start_date to DateTime and subtract date_prior
            adjusted_date = DateTime(result.cohort_start_date[1]) - date_prior

            # Store the cohort_id and the adjusted_date in the dictionary
            cohort_entry_dates[cohort_id] = adjusted_date
        end
    end

    return cohort_entry_dates
end

@TheCedarPrince
Member

Hey @Jay-sanjay, after spending some time experimenting with your implementations, I think the overall dispatch ideas look great. I'm looking forward to chatting further tomorrow to hear your opinions and thoughts!
