## Query for a Demo Site
Start by querying a site for a patient with known session data (i.e., Dose_Hst records). We fetch the first site for the patient and store its SIT_ID and SIT_SET_ID for subsequent queries.

In [2]:
from pymedphys import mosaiq
msq_server, test_db_name, pat_id1 = '.', 'MosaiqTest94086', 10003

connection = mosaiq.connect(msq_server, database=test_db_name)
sites = mosaiq.execute(
    connection,
    """
    SELECT 
        SIT_ID, 
        SIT_SET_ID, 
        Site_Name,
        Notes
    FROM Site 
    WHERE 
        Version = 0 
        AND Pat_ID1 = %(pat_id1)s
    """,
    { "pat_id1": pat_id1 })

sit_id, sit_set_id = sites[0][0], sites[0][1]
print(f"SIT_ID:{sit_id}  SIT_SET_ID:{sit_set_id}  Site_Name:{sites[0][2]}  Notes:{sites[0][3]}")

SIT_ID:3  SIT_SET_ID:3  Site_Name:rx1  Notes:weekly
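
The cell above indexes the first row directly, which would fail with an unhelpful error if the patient had no matching Site rows. A minimal sketch of a guard, assuming the query result is a sequence of row tuples as the indexing above suggests; ```first_site_ids``` is a hypothetical helper and not part of pymedphys:

```python
def first_site_ids(rows):
    """Return (SIT_ID, SIT_SET_ID) from the first row.

    Hypothetical helper for illustration; assumes each row is a tuple
    whose first two elements are SIT_ID and SIT_SET_ID.
    """
    if not rows:
        raise ValueError("no Site rows returned for this patient")
    return rows[0][0], rows[0][1]


# Row shaped like the query result printed above.
rows = [(3, 3, "rx1", "weekly")]
ids = first_site_ids(rows)
print(ids)
```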


## Query for Dose_Hst
For the selected site, query for the Dose_Hst records that are associated via SIT_ID, and list the first ten of them.

Dose_Hst is not a versioned entity, so there is no need to select the tip versions.

In [3]:
import pprint

dose_hsts = mosaiq.execute(
    connection,
    """
    SELECT 
        Tx_DtTm
    FROM Dose_Hst 
    WHERE 
        Dose_Hst.SIT_ID = %(sit_id)s
    ORDER BY Tx_DtTm
    """, 
    { "sit_id": sit_id })

dose_hst_datetimes = [dose_hst[0].strftime('%Y-%m-%d %H:%M') 
                    for dose_hst in dose_hsts]
print(f"Dose_Hst records for site {sites[0][2]}:")
pprint.pprint(dose_hst_datetimes[:10])

Dose_Hst records for site rx1:
['2021-04-15 11:05',
 '2021-04-15 11:09',
 '2021-04-15 11:13',
 '2021-04-15 11:19',
 '2021-04-15 11:22',
 '2021-04-15 11:25',
 '2021-04-15 11:28',
 '2021-04-16 11:05',
 '2021-04-16 11:08',
 '2021-04-16 11:11']


## Cluster Dose_Hst into Sessions
Mosaiq's data schema doesn't explicitly group Dose_Hst and Offset records into sessions/fractions, but a simple clustering trick is generally enough to form the sessions.

The ```cluster_sessions``` function uses a [hierarchical clustering algorithm](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html) to cluster the Dose_Hst records.  
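
As a rough illustration of the idea only (this is not the pymedphys implementation): for one-dimensional timestamps, single-linkage clustering with a distance threshold is equivalent to sorting the values and starting a new group wherever the gap between neighbours reaches the threshold. The sketch below uses a hypothetical ```group_by_gap``` helper and assumes that a gap equal to the interval starts a new group.

```python
from datetime import datetime, timedelta


def group_by_gap(datetimes, interval):
    """Group datetimes into (number, start, end) tuples.

    A new group starts whenever the gap to the previous datetime is
    greater than or equal to `interval`. Hypothetical helper, for
    illustration only.
    """
    groups = []
    for dt in sorted(datetimes):
        if groups and dt - groups[-1][-1] < interval:
            groups[-1].append(dt)  # close enough: same session
        else:
            groups.append([dt])  # gap reached the threshold: new session
    return [(n, g[0], g[-1]) for n, g in enumerate(groups, start=1)]


# Three bursts of three hourly treatments, five hours apart.
base = datetime(2021, 2, 9, 18, 0)
mock = [base + timedelta(hours=h * 5 + j) for h in range(3) for j in range(3)]

sessions = group_by_gap(mock, interval=timedelta(hours=3))
for number, start, end in sessions:
    print(f"Session#{number}: {start} to {end}")
```

Grouping by gap rather than by calendar date keeps a session together even when it crosses midnight.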

To demonstrate the clustering, generate a list of irregularly spaced datetimes to pass to ```cluster_sessions```, which will return a tuple for each session of:
* session number (from 1)
* session start date/time
* session end date/time

In [4]:
from pymedphys._mosaiq.sessions import cluster_sessions
from datetime import datetime, timedelta

test_datetimes = [datetime.now() + timedelta(hours=h*5 + j) 
                    for h in range(3) for j in range(3)]

print('Mock tx date/times:')
for dt in test_datetimes:  # avoid shadowing the imported datetime class
    print('\t', str(dt))

print("\nClustered into sessions with 3 hour interval:")
for session in cluster_sessions(test_datetimes, interval=timedelta(hours=3)):
    print(f"\tSession#{session[0]}: {str(session[1])} to {str(session[2])}")

Mock tx date/times:
	 2021-02-09 18:01:52.573723
	 2021-02-09 19:01:52.573723
	 2021-02-09 20:01:52.573723
	 2021-02-09 23:01:52.573723
	 2021-02-10 00:01:52.573723
	 2021-02-10 01:01:52.573723
	 2021-02-10 04:01:52.573723
	 2021-02-10 05:01:52.573723
	 2021-02-10 06:01:52.573723

Clustered into sessions with 3 hour interval:
	Session#1: 2021-02-09 18:01:52.573723 to 2021-02-09 20:01:52.573723
	Session#2: 2021-02-09 23:01:52.573723 to 2021-02-10 01:01:52.573723
	Session#3: 2021-02-10 04:01:52.573723 to 2021-02-10 06:01:52.573723


The sessions for the queried site can now be created using the ```sessions_for_site``` function, which first queries for the Dose_Hst records and then calls ```cluster_sessions``` on their Dose_Hst.Tx_DtTm values.

In [5]:
from pymedphys._mosaiq.sessions import sessions_for_site

print(f"Dose_Hst.Tx_DtTm-based session intervals "
      f"for SIT_SET_ID = {sit_set_id} in Msq db:")
for session in sessions_for_site(connection, sit_set_id):
    print(f"\tSession#{session[0]}: {str(session[1])} to {str(session[2])}")

Dose_Hst.Tx_DtTm-based session intervals for SIT_SET_ID = 3 in Msq db:
	Session#1: 2021-04-15 11:05:00 to 2021-04-15 11:28:00
	Session#2: 2021-04-16 11:05:00 to 2021-04-16 11:30:00
	Session#3: 2021-04-19 11:10:00 to 2021-04-19 11:38:00
	Session#4: 2021-04-21 11:07:00 to 2021-04-21 11:35:00
	Session#5: 2021-04-22 11:05:00 to 2021-04-22 11:29:00
	Session#6: 2021-04-23 11:06:00 to 2021-04-23 11:31:00
	Session#7: 2021-04-26 11:04:00 to 2021-04-26 11:33:00
	Session#8: 2021-04-27 11:09:00 to 2021-04-27 11:41:00
	Session#9: 2021-04-28 11:07:00 to 2021-04-28 11:40:00
	Session#10: 2021-04-29 11:07:00 to 2021-04-29 11:33:00
	Session#11: 2021-04-30 11:03:00 to 2021-04-30 11:32:00
	Session#12: 2021-05-03 11:04:00 to 2021-05-03 11:31:00
	Session#13: 2021-05-04 11:06:00 to 2021-05-04 11:31:00
	Session#14: 2021-05-06 11:05:00 to 2021-05-06 11:32:00
	Session#15: 2021-05-10 11:03:00 to 2021-05-10 11:31:00
	Session#16: 2021-05-11 11:08:00 to 2021-05-11 11:39:00
	Session#17: 2021-05-12 11:06:00 to 2021-0

## Session Offsets
Now that we can get the sessions for a site, we can also query to find any Offsets that occur within a +/- 1 hour time window of the session interval.  This is done by the ```session_offsets_for_site``` function.

Session offsets are returned by a generator as a tuple of Session Number and Offset values.  If no Offset falls within the window for a session, then None is returned for the offset member of the tuple.
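
The windowing rule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pymedphys implementation: ```offsets_for_sessions``` is a hypothetical helper, sessions are assumed to be (number, start, end) tuples, and each offset is assumed to be a tuple whose first element is its date/time. Whether the real function yields more than one offset per session is not stated here; the sketch yields one pair per matching offset.

```python
from datetime import datetime, timedelta


def offsets_for_sessions(sessions, offsets, interval=timedelta(hours=1)):
    """Yield (session_number, offset) pairs.

    An offset matches a session when its date/time falls within
    [start - interval, end + interval]. Sessions with no matching
    offset yield (session_number, None). Hypothetical helper.
    """
    for number, start, end in sessions:
        matched = [
            offset
            for offset in offsets
            if start - interval <= offset[0] <= end + interval
        ]
        if not matched:
            yield number, None
        for offset in matched:
            yield number, offset


# Values taken from the session and offset listings in this notebook.
sessions = [
    (2, datetime(2021, 4, 16, 11, 5), datetime(2021, 4, 16, 11, 30)),
    (3, datetime(2021, 4, 19, 11, 10), datetime(2021, 4, 19, 11, 38)),
]
offsets = [(datetime(2021, 4, 19, 11, 4), 0.6, -1.3, -4.9)]

results = list(offsets_for_sessions(sessions, offsets))
for number, offset in results:
    print(number, offset)
```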

In [6]:
from pymedphys._mosaiq.sessions import session_offsets_for_site

print(f"Offset records for SIT_SET_ID {sit_set_id}")    
for session_num, offset in session_offsets_for_site(
    connection, sit_set_id, interval=timedelta(hours=1)
):
    if offset:
        print(f"\tSession#{session_num}: "
            f"{offset[0].strftime('%Y-%m-%d %H:%M')}: "
            f"{offset[1]}/{offset[2]}/{offset[3]}")
    else:
        print(f"\tSession#{session_num}: no session offsets")

Offset records for SIT_SET_ID 3
	Session#1: no session offsets
	Session#2: no session offsets
	Session#3: 2021-04-19 11:04: 0.6/-1.3/-4.9
	Session#4: 2021-04-21 11:04: -4.5/-4.8/2.8
	Session#5: no session offsets
	Session#6: no session offsets
	Session#7: no session offsets
	Session#8: 2021-04-27 11:03: -3.3/2.5/2.3
	Session#9: 2021-04-28 11:02: -3.9/-0.4/-4.2
	Session#10: 2021-04-29 11:02: 3.6/-3.8/4.4
	Session#11: no session offsets
	Session#12: no session offsets
	Session#13: no session offsets
	Session#14: no session offsets
	Session#15: no session offsets
	Session#16: 2021-05-11 11:02: -0.6/2/-4
	Session#17: no session offsets
	Session#18: 2021-05-13 11:02: 4.3/4.7/1.3
	Session#19: no session offsets
	Session#20: no session offsets
	Session#21: 2021-05-18 11:05: -0.1/3.4/-3.2
	Session#22: 2021-05-19 11:05: 4.3/2.2/0.2
	Session#23: 2021-05-20 11:05: -1.1/-4.1/-2.5
	Session#24: no session offsets
	Session#25: no session offsets
