## Query for a Demo Site
Start by querying a site for a patient with known session data (i.e. Dose_Hst records). We fetch the first site for the patient and store its SIT_ID and SIT_SET_ID for subsequent queries.

In [18]:
from pymedphys import mosaiq
msq_server, pat_id1 = '.', 10003
with mosaiq.connect(msq_server) as cursor:
    sites = mosaiq.execute(cursor,
        """
        SELECT 
            SIT_ID, 
            SIT_SET_ID, 
            Site_Name,
            Notes
        FROM Site 
        WHERE 
            Version = 0 
            AND Pat_ID1 = %(pat_id1)s
        """,
        { "pat_id1": pat_id1 })

    sit_id, sit_set_id = sites[0][0], sites[0][1]
    print(f"SIT_ID:{sit_id}  SIT_SET_ID:{sit_set_id}  Site_Name:{sites[0][2]}  Notes:{sites[0][3]}")

SIT_ID:3  SIT_SET_ID:3  Site_Name:rx1  Notes:daily


## Query for Dose_Hst
For the selected site, query the Dose_Hst records that are associated with it via SIT_ID, and list the first ten of them.

Dose_Hst is not a versioned entity, so there is no need to filter for the tip version.

In [19]:
import pprint

with mosaiq.connect(msq_server) as cursor:
    dose_hsts = mosaiq.execute(cursor, 
        """
        SELECT 
            Tx_DtTm
        FROM Dose_Hst 
        WHERE 
            Dose_Hst.SIT_ID = %(sit_id)s
        ORDER BY Tx_DtTm
        """, 
        { "sit_id": sit_id })

    dose_hst_datetimes = [dose_hst[0].strftime('%Y-%m-%d %H:%M') 
                        for dose_hst in dose_hsts]
    print(f"Dose_Hst records for site {sites[0][2]}:")
    pprint.pprint(dose_hst_datetimes[:10])

Dose_Hst records for site rx1:
['2021-05-31 14:07',
 '2021-05-31 14:12',
 '2021-05-31 14:16',
 '2021-05-31 14:19',
 '2021-06-02 14:08',
 '2021-06-02 14:11',
 '2021-06-02 14:14',
 '2021-06-02 14:19',
 '2021-06-03 14:06',
 '2021-06-03 14:12']
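
The two queries above can be tried without a Mosaiq server. The following is a minimal sketch that uses an in-memory `sqlite3` database as a stand-in: the table layout only mirrors the columns used above, the rows are made up, and the helper names `first_site` and `tx_datetimes` are hypothetical. Note that `sqlite3` uses `:name` placeholders rather than the `%(name)s` style shown above. The sketch also guards against a patient with no sites, a case in which indexing `sites[0]` would raise an `IndexError`.

```python
import sqlite3
from datetime import datetime

# Hypothetical in-memory stand-in for the Mosaiq tables; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Site (SIT_ID INTEGER, SIT_SET_ID INTEGER, Site_Name TEXT,
                       Notes TEXT, Version INTEGER, Pat_ID1 INTEGER);
    CREATE TABLE Dose_Hst (SIT_ID INTEGER, Tx_DtTm TEXT);
    INSERT INTO Site VALUES (3, 3, 'rx1', 'daily', 0, 10003);
    INSERT INTO Site VALUES (2, 3, 'rx1', 'old', 1, 10003);
    INSERT INTO Dose_Hst VALUES (3, '2021-05-31 14:12:00');
    INSERT INTO Dose_Hst VALUES (3, '2021-05-31 14:07:00');
    INSERT INTO Dose_Hst VALUES (9, '2021-05-31 09:00:00');
""")


def first_site(conn, pat_id1):
    """Return (SIT_ID, SIT_SET_ID, Site_Name) of the first tip-version site, or None."""
    rows = conn.execute(
        "SELECT SIT_ID, SIT_SET_ID, Site_Name FROM Site "
        "WHERE Version = 0 AND Pat_ID1 = :pat_id1",
        {"pat_id1": pat_id1},
    ).fetchall()
    return rows[0] if rows else None


def tx_datetimes(conn, sit_id):
    """Return the treatment datetimes for one site, oldest first."""
    rows = conn.execute(
        "SELECT Tx_DtTm FROM Dose_Hst WHERE SIT_ID = :sit_id ORDER BY Tx_DtTm",
        {"sit_id": sit_id},
    ).fetchall()
    # sqlite3 stores the datetimes as text here, so parse them back.
    return [datetime.fromisoformat(row[0]) for row in rows]


mock_site = first_site(conn, 10003)
mock_tx = tx_datetimes(conn, mock_site[0])
print(mock_site, [t.strftime("%Y-%m-%d %H:%M") for t in mock_tx])
```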


## Cluster Dose_Hst into sessions
Mosaiq's data schema doesn't explicitly group Dose_Hst and Offset records into sessions/fractions, but a simple clustering trick is generally enough to form the sessions.

The ```cluster_sessions``` function uses a [hierarchical clustering algorithm](https://scikit-learn.org/stable/modules/generated/sklearn.cluster.AgglomerativeClustering.html) to cluster the Dose_Hst records.  

To demonstrate the clustering, generate a list of irregularly spaced datetimes to pass to ```cluster_sessions```, which will return a tuple for each session of:
* session number (from 1)
* session start date/time
* session end date/time

In [20]:
from session_offsets_calculator import cluster_sessions
from datetime import datetime, timedelta

test_datetimes = [datetime.now() + timedelta(hours=h*5 + j) 
                    for h in range(3) for j in range(3)]

print('Mock tx date/times:')
# Use a loop variable that does not shadow the imported datetime class
for test_datetime in test_datetimes:
    print('\t', str(test_datetime))

print("\nClustered into sessions with 3 hour interval:")
for session in cluster_sessions(test_datetimes, interval=timedelta(hours=3)):
    print(f"\tSession#{session[0]}: {str(session[1])} to {str(session[2])}")

Mock tx date/times:
	 2021-01-15 12:42:23.780435
	 2021-01-15 13:42:23.780435
	 2021-01-15 14:42:23.780435
	 2021-01-15 17:42:23.780435
	 2021-01-15 18:42:23.780435
	 2021-01-15 19:42:23.780435
	 2021-01-15 22:42:23.780435
	 2021-01-15 23:42:23.780435
	 2021-01-16 00:42:23.780435

Clustered into sessions with 3 hour interval:
	Session#1: 2021-01-15 12:42:23.780435 to 2021-01-15 14:42:23.780435
	Session#2: 2021-01-15 17:42:23.780435 to 2021-01-15 19:42:23.780435
	Session#3: 2021-01-15 22:42:23.780435 to 2021-01-16 00:42:23.780435
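
For intuition, the grouping shown above can be approximated without scikit-learn. The sketch below is a simplified stand-in, not the `cluster_sessions` implementation: it sorts the datetimes and starts a new session wherever the gap to the previous record is at least `interval`. The function name `gap_sessions` is hypothetical, and the "at least" comparison is an assumption chosen so that the mock data (3 hour gaps between groups, 3 hour interval) splits into three sessions as in the output above.

```python
from datetime import datetime, timedelta


def gap_sessions(datetimes, interval=timedelta(hours=3)):
    """Yield (session_number, start, end) tuples, numbering sessions from 1.

    Assumption: a new session starts when the gap to the previous
    datetime is greater than or equal to `interval`.
    """
    ordered = sorted(datetimes)
    if not ordered:
        return
    number, start, end = 1, ordered[0], ordered[0]
    for current in ordered[1:]:
        if current - end >= interval:
            # Gap is large enough: close the current session, open a new one.
            yield number, start, end
            number, start = number + 1, current
        end = current
    yield number, start, end


# Same spacing as the mock datetimes above, but with a fixed start time.
base = datetime(2021, 1, 15, 12, 0)
mock = [base + timedelta(hours=h * 5 + j) for h in range(3) for j in range(3)]
for number, start, end in gap_sessions(mock):
    print(f"Session#{number}: {start} to {end}")
```

A fixed start time is used instead of `datetime.now()` so that the result is reproducible.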


The sessions for the queried site can now be created using the ```sessions_for_site``` function, which first queries for Dose_Hst records and then calls ```cluster_sessions``` on their Dose_Hst.Tx_DtTm values.

In [21]:
from session_offsets_calculator import sessions_for_site

with mosaiq.connect(msq_server) as cursor:
    print(f"Dose_Hst.Tx_DtTm-based session intervals "
          f"for SIT_SET_ID = {sit_set_id} in Msq db:")
    for session in sessions_for_site(cursor, sit_set_id):
        print(f"\tSession#{session[0]}: {str(session[1])} to {str(session[2])}")

Dose_Hst.Tx_DtTm-based session intervals for SIT_SET_ID = 3 in Msq db:
	Session#1: 2021-05-31 14:07:00 to 2021-05-31 14:19:00
	Session#2: 2021-06-02 14:08:00 to 2021-06-02 14:19:00
	Session#3: 2021-06-03 14:06:00 to 2021-06-03 14:20:00
	Session#4: 2021-06-04 14:09:00 to 2021-06-04 14:25:00
	Session#5: 2021-06-07 14:08:00 to 2021-06-07 14:25:00
	Session#6: 2021-06-08 14:10:00 to 2021-06-08 14:27:00
	Session#7: 2021-06-09 14:06:00 to 2021-06-09 14:19:00
	Session#8: 2021-06-10 14:07:00 to 2021-06-10 14:21:00
	Session#9: 2021-06-11 14:10:00 to 2021-06-11 14:23:00
	Session#10: 2021-06-14 14:09:00 to 2021-06-14 14:24:00
	Session#11: 2021-06-15 14:09:00 to 2021-06-15 14:22:00
	Session#12: 2021-06-16 14:07:00 to 2021-06-16 14:21:00
	Session#13: 2021-06-17 14:06:00 to 2021-06-17 14:17:00
	Session#14: 2021-06-18 14:07:00 to 2021-06-18 14:18:00
	Session#15: 2021-06-21 14:05:00 to 2021-06-21 14:18:00
	Session#16: 2021-06-22 14:08:00 to 2021-06-22 14:24:00
	Session#17: 2021-06-23 14:06:00 to 2021-0

## Session Offsets
Now that we can get the sessions for a site, we can also query for any Offsets that occur within a +/- 1 hour time window of each session interval. This is done by the ```session_offsets_for_site``` function.

The function is a generator that yields a tuple of session number and Offset values for each session. If no Offset falls within the window for a session, the offset member of the tuple is None.

In [22]:
from session_offsets_calculator import session_offsets_for_site

with mosaiq.connect(msq_server) as cursor:
    print(f"Offset records for SIT_SET_ID {sit_set_id}")    
    for session_num, offset in session_offsets_for_site(
        cursor, sit_set_id, interval=timedelta(hours=1)
    ):
        if offset:
            print(f"\tSession#{session_num}: "
                f"{offset[0].strftime('%Y-%m-%d %H:%M')}: "
                f"{offset[1]}/{offset[2]}/{offset[3]}")
        else:
            print(f"\tSession#{session_num}: no session offsets")

Offset records for SIT_SET_ID 3
	Session#1: 2021-05-31 14:04: -4.4/4.4/-2.8
	Session#2: 2021-06-02 14:02: 3.4/0.4/-0.1
	Session#3: 2021-06-03 14:02: 1.4/-1.5/-4.9
	Session#4: 2021-06-04 14:05: 4.6/1.7/-4.4
	Session#5: 2021-06-07 14:02: 3.6/-1.1/-1.9
	Session#6: 2021-06-08 14:04: -1.1/-4.7/-4.1
	Session#7: no session offsets
	Session#8: 2021-06-10 14:02: -3.7/-4.4/-5
	Session#9: 2021-06-11 14:05: -2/-2/1.6
	Session#10: 2021-06-14 14:04: 4.3/-3.1/0.4
	Session#11: 2021-06-15 14:03: -0.8/4.7/3.6
	Session#12: 2021-06-16 14:04: 4/1.1/1.7
	Session#13: 2021-06-17 14:02: -3/2.4/4.5
	Session#14: 2021-06-18 14:04: -4.2/2.4/-3.9
	Session#15: 2021-06-21 14:02: 3/0/-3.7
	Session#16: 2021-06-22 14:02: -3.7/4.6/-5
	Session#17: 2021-06-23 14:03: 2.1/-3.2/0.6
	Session#18: 2021-06-24 14:04: -2.1/-1.8/3.1
	Session#19: 2021-06-25 14:03: -4.9/2.6/-4.8
	Session#20: 2021-06-28 14:04: -2.7/-1.5/-3.4
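
The window matching described in this section can be sketched in plain Python. This is an illustration under stated assumptions, not the `session_offsets_for_site` implementation: the function name `match_offsets` is hypothetical, offsets are assumed to be tuples whose first element is the offset datetime, and when several offsets fall within one window the earliest is reported (the real function may behave differently). The session and offset values are copied from the outputs above.

```python
from datetime import datetime, timedelta


def match_offsets(sessions, offsets, interval=timedelta(hours=1)):
    """Yield (session_number, offset_or_None) for each session.

    sessions: iterable of (number, start, end) tuples
    offsets:  iterable of tuples whose first element is a datetime
    """
    ordered = sorted(offsets, key=lambda offset: offset[0])
    for number, start, end in sessions:
        in_window = [
            offset for offset in ordered
            if start - interval <= offset[0] <= end + interval
        ]
        # Assumption: report the earliest offset in the window, else None.
        yield number, (in_window[0] if in_window else None)


demo_sessions = [
    (1, datetime(2021, 5, 31, 14, 7), datetime(2021, 5, 31, 14, 19)),
    (2, datetime(2021, 6, 2, 14, 8), datetime(2021, 6, 2, 14, 19)),
    (7, datetime(2021, 6, 9, 14, 6), datetime(2021, 6, 9, 14, 19)),
]
demo_offsets = [
    (datetime(2021, 5, 31, 14, 4), -4.4, 4.4, -2.8),
    (datetime(2021, 6, 2, 14, 2), 3.4, 0.4, -0.1),
]
for number, offset in match_offsets(demo_sessions, demo_offsets):
    print(number, offset)
```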
