## App Design

* Idea:
    * The application would be a chatbot
    * It would be integrated into health-tracking apps such as Garmin Connect, Apple's built-in Health app, Strava, and Fitbit
    * The intention is for the LLM app to access the data recorded in these running applications (e.g. heart rates, paces) and use it to answer user questions
    * In theory, it could also search for and access other online information to answer queries that go beyond simple quantitative measures like pace ranges or weekly training plans (e.g. targeted weightlifting exercises, injury-prevention exercises, injury-related questions, and questions whose answers vary by individual, such as personalized weekly mileage)

### Premise:
The premise of this application is its access to user-specific training data, which would be recorded and supplied by another application. As athletes, we have access to this kind of data ourselves, so we can provide sample files of the data the application would need:

### Example 1 kind of training data: 
> Run-specific metrics (heart rates, paces, zones, etc. for a particular run on a given day) 

* These are the statistics for a sample run that our app might need to answer a user question
  * This information is available in the Garmin Connect app

<img src="sample_run_1.png" width="90%"/>

* This is a chart displaying Heart Rate Zones throughout the run:

<img src="heart_rates_1.png" width="90%"/>

* The associated per-lap data (times, paces, heart rates, cadence, etc.) is contained in the following CSV file:

In [1]:
### Load the per-lap CSV for sample run 1

import pandas as pd

df = pd.read_csv('sample_run_1.csv')
# Keep the first 16 rows; row 0 holds the units for each column
sample_1 = df.iloc[0:16]
sample_1

Unnamed: 0,aps,Time,Cumulative Time,Distance,Avg Pace,Avg GAP,Avg HR,Max HR,Total Ascent,Total Descent,...,Avg GCT Balance,Avg Stride Length,Avg Vertical Oscillation,Avg Vertical Ratio,Calories,Avg Temperature,Best Pace,Max Run Cadence,Moving Time,Avg Moving Pace
0,,,,mi,min/mi,min/mi,bpm,bpm,ft,ft,...,%,m,cm,%,C,,min/mi,spm,,min/mi
1,2,00:43.6,10:06,0.08,8:53,,148,151,0,0,...,,1.03,,,9,,8:49,179,0:42,8:34
2,3,02:01.5,12:07,0.31,6:31,,158,163,0,0,...,,1.37,,,31,,6:02,185,02:01.5,6:31
3,4,02:01.3,14:08,0.2,10:03,,148,162,0,0,...,,0.92,,,26,,6:24,185,2:01,10:02
4,5,02:01.7,16:10,0.3,6:44,,157,164,0,0,...,,1.31,,,28,,6:28,187,02:01.7,6:44
5,6,02:00.4,18:10,0.23,8:34,,154,162,0,0,...,,1.06,,,27,,6:36,183,2:00,8:32
6,7,02:00.5,20:11,0.29,6:52,,162,170,0,0,...,,1.28,,,28,,6:05,187,02:00.5,6:52
7,8,02:00.0,22:11,0.26,7:42,,160,167,0,0,...,,1.16,,,28,,6:26,185,02:00.0,7:42
8,9,02:01.3,24:12:00,0.28,7:18,,166,172,0,0,...,,1.2,,,29,,6:36,187,2:01,7:17
9,10,02:00.5,26:13:00,0.26,7:47,,164,171,0,0,...,,1.16,,,28,,7:02,183,2:00,7:45
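
Paces in this file are stored as `m:ss` strings, so a question such as "which laps were faster than 7:00 per mile?" needs them converted to numbers first. The following is a minimal sketch of that conversion, not part of the app itself: the helper names `pace_to_seconds` and `seconds_to_pace` and the `laps` list are hypothetical, and the pace and heart-rate pairs are copied from the first four laps shown above.

```python
def pace_to_seconds(pace: str) -> int:
    """Convert an 'm:ss' pace string (min/mi) to seconds per mile."""
    minutes, seconds = pace.strip().split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_to_pace(total_seconds: int) -> str:
    """Convert seconds per mile back to an 'm:ss' pace string."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


# (Avg Pace, Avg HR) pairs taken from the first four laps of the sample run
laps = [("8:53", 148), ("6:31", 158), ("10:03", 148), ("6:44", 157)]

# Keep only the laps run faster than 7:00 min/mi
threshold = pace_to_seconds("7:00")
fast_laps = [(pace, hr) for pace, hr in laps if pace_to_seconds(pace) < threshold]

for pace, hr in fast_laps:
    print(f"pace {pace} min/mi ({pace_to_seconds(pace)} s/mi), avg HR {hr} bpm")
```

With pandas, the same helper could be applied to a whole column through `Series.map` once the units row has been dropped.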


### Example 2 kind of training data:
> Sample workout training for 4 weeks

* Another piece of training data gives the LLM a reference when crafting training plans (e.g. for queries requesting a daily, weekly, or monthly training breakdown)
   * This data is titled: Sample training plan
   * The LLM can use this file as a reference for how to structure its output

In [2]:
### Loading the text file:

with open("sample_training_plan.txt", "r") as file:
    content = file.read()

In [3]:
### Print the first 3500 characters (roughly the first week) of the sample training plan:

print(content[:3500])

Penn Women’s XC Summer Training
Block 3 (July 29th – August 23rd)

Block 4 Objective – The primary goal of this workout is to introduce two new types of workouts – 1) progression runs (which will then become more intensive tempo effots) and 2) interval efforts at race effort or race pace. 

The progression runs are a continuation of the marathon paced work that we have been doing. This will continue to help you gain aerobic strength and cardiovascular fitness. The priotity in progression runs is the ability to get faster every mile. This means that these efforts will be harder and more taxing than a marathon paced run. Make sure you know your targets when you are starting these workouts. 

The interval efforts are the introduction of race pace work. As we get into August, we are less than 1 month away from our first competition of the season. This means we are going to start entering our pre-competition phase of training. The reminder is that we do NOT want to be at our peak fitness in

### Example 3:
> Online information (i.e. VDOT chart)

* A third kind of information the model would have access to is supplementary online information, which it would search for when prompted and use to improve the accuracy of its output
* For instance, the following code downloads and opens a VDOT chart found online:


In [4]:
import tempfile
import webbrowser
from pathlib import Path

import requests

def open_pdf_from_url(url):
    """Download a PDF to a temporary file and open it in the default viewer."""
    print(f"Downloading PDF from {url}...")
    # A timeout keeps the cell from hanging if the server never responds
    response = requests.get(url, timeout=30)

    if response.status_code != 200:
        print(f"Failed to download PDF: HTTP status code {response.status_code}")
        return None

    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile(delete=False, suffix='.pdf') as temp_file:
        temp_file.write(response.content)
        temp_file_path = temp_file.name

    print(f"PDF downloaded successfully to temporary file: {temp_file_path}")

    # Path.as_uri() builds a valid file:// URL on every platform
    print("Opening PDF with default viewer...")
    webbrowser.open(Path(temp_file_path).resolve().as_uri())

    return temp_file_path

# Open the specific PDF
pdf_url = "https://sdtrackmag.com/DanielsOneSheet.pdf"
temp_file = open_pdf_from_url(pdf_url)

Downloading PDF from https://sdtrackmag.com/DanielsOneSheet.pdf...
PDF downloaded successfully to temporary file: /tmp/tmpke98t44o.pdf
Opening PDF with default viewer...


* This particular file had to be opened this way because it is only available as an online PDF link
* In other cases where the model needs access to, say, published scholarly or technical documents (on recovery practices, injury prevention and management, literature reviews of the most effective and efficient running practices, etc.), these documents can be opened and read as .txt files

### Theoretical Implementation:
  * The idea behind this app remains primarily theoretical, as we got caught up in the technicalities of implementing LangChain and RAG techniques
  * However, were we to implement it properly, it would follow this kind of pipeline:
    * Load the relevant data and documents: online resources (VDOT charts, physiology literature, the best weightlifting and supplementary exercises), sample training plans (for context), and user-specific biometric and run data (e.g. heart rates, paces, PRs, all provided by the user's app of choice)
    * Split the documents into chunks, then load the chunks into a vector store
    * Finally, set up the LangChain RAG chains
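
The pipeline above can be illustrated with a small, dependency-free sketch. This is not a LangChain implementation: word counts stand in for a real embedding model, an in-memory list stands in for a real vector store, and the final step only builds the prompt that would be sent to the LLM. All names (`split_into_chunks`, `embed`, `ToyVectorStore`, `build_prompt`) are hypothetical, and the three documents are short paraphrases of the sample data shown earlier.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    step = chunk_size - overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    return [chunk for chunk in chunks if chunk.strip()]


def embed(text):
    """Toy embedding: lowercase word counts (assumption: a real app would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store: keeps (chunk, vector) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, chunks):
        for chunk in chunks:
            self.entries.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vector, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the user question into a single LLM prompt."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Step 1: load documents (short paraphrases of the sample data, for illustration only)
documents = [
    "Progression runs: the priority is the ability to get faster every mile.",
    "Interval efforts introduce race pace work ahead of the first competition.",
    "Sample run lap: average pace 6:31 min/mi, average heart rate 158 bpm.",
]

# Step 2: split into chunks and load them into the store
store = ToyVectorStore()
for document in documents:
    store.add(split_into_chunks(document))

# Step 3: retrieve the most relevant chunk and build the prompt for the LLM
question = "What is the priority in progression runs?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real implementation, each stand-in would be replaced by the corresponding LangChain component (document loader, text splitter, vector store, and retrieval chain), while the overall flow stays the same.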