# **Fitness RAG: A RAG model for fitness**

A RAG pipeline has been developed using LlamaIndex. Generation is carried out by Llama 3, an open-source LLM from Meta. Responses are augmented with fitness data that is embedded and stored as vectors in a Chroma database. Ollama serves as the interface to Llama 3.

# Prerequisites

**Packages**

In [1]:
# Install prerequisites
!pip install sentence-transformers
!pip install llama-index ipywidgets
!pip install llama-index-embeddings-huggingface
!pip install llama-index-llms-ollama
!pip install llama-index-llms-huggingface
!pip install llama-index-readers-web
!pip install llama-index-vector-stores-chroma
!pip install chromadb

import pandas as pd



**Setting up the Model**

In [2]:
%%capture
# Install Ollama v0.1.30
!curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.30#' | sh

In [3]:
%%capture
# Setup the model as a global variable
OLLAMA_MODEL='llama3:latest'

# Add the model to the environment of the operating system
import os
os.environ['OLLAMA_MODEL'] = OLLAMA_MODEL
!echo $OLLAMA_MODEL # print the global variable to check it saved

import subprocess
import time

# Start the Ollama server ("serve")
command = "nohup ollama serve&" # "nohup" and "&" run the process in the background

# Use subprocess.Popen to run the command
process = subprocess.Popen(command,
                            shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)

time.sleep(5)  # Makes Python wait for 5 seconds

# Import required modules from the llama_index library
from llama_index.core import VectorStoreIndex, SummaryIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.core import StorageContext

# Import ChromaVectorStore and chromadb module
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Use the global variable (OLLAMA_MODEL) as our LLM
# Set a request timeout of 8 minutes (480 s) to allow for slow CPU-only inference
llm = Ollama(model=OLLAMA_MODEL, request_timeout=480.0)
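
The fixed `time.sleep(5)` above assumes the server is ready after five seconds. A minimal sketch of an alternative is to poll the server's port until it accepts connections. The helper name `wait_for_port` is hypothetical and not part of Ollama or LlamaIndex; 11434 is Ollama's default port.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable): wait and retry
            time.sleep(interval)
    return False


# Hypothetical usage in place of time.sleep(5):
# if not wait_for_port("127.0.0.1", 11434):
#     raise RuntimeError("Ollama server did not start in time")
```

This only confirms that the port is open, not that a model has been pulled or loaded.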

In [4]:
# Query the model via the command line
# The first run will "pull" (download) the model
!ollama run $OLLAMA_MODEL "Tell me a joke"

**Initializing embedding model**

In [5]:
from sentence_transformers import SentenceTransformer

# Load the pre-trained embedding model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
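
The embedding model maps each piece of text to a 384-dimensional vector, and retrieval later compares those vectors. As a rough illustration of that comparison, here is a pure-Python cosine similarity on made-up 3-dimensional vectors; with the real model the vectors would come from `model.encode(...)`.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration only (real embeddings have 384 dimensions)
query = [0.9, 0.1, 0.0]
curls = [0.8, 0.2, 0.1]
phones = [0.0, 0.1, 0.9]

# The fitness-like vector should score higher than the unrelated one
print(cosine_similarity(query, curls) > cosine_similarity(query, phones))
```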



# Data Processing

**Importing the Dataset**

In [6]:
from google.colab import files

# Upload the CSV file
uploaded = files.upload()

# Use the name of the first uploaded file (FITNESS.csv in this run)
file_name = list(uploaded.keys())[0]


Saving FITNESS.csv to FITNESS.csv


**Loading Data For Chunking**


In [7]:
from llama_index.readers.file import CSVReader
from llama_index.core.node_parser import SentenceSplitter
from pathlib import Path # for finding the file

Fitness_docs = CSVReader().load_data(Path(file_name))


**Data Checking**

In [8]:
import pandas as pd
df=pd.read_csv(file_name)
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 928 entries, 0 to 927
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   Human      928 non-null    object
 1   Assistant  928 non-null    object
dtypes: object(2)
memory usage: 14.6+ KB
None


In [9]:
print(df.head())

                                               Human  \
0           can you recommend effective ab exercises   
1  what are some effective strategies for managin...   
2  how can i incorporate regular movement and phy...   
3  are there any specific strategies for maintain...   
4  how can i manage stress and maintain a healthy...   

                                           Assistant  
0  planks bicycle crunches and leg raises are gre...  
1   make time for relaxation take time to catch y...  
2   take the stairs instead of the elevator whene...  
3   eat a healthy diet eating a healthy diet rich...  
4   manage your time wisely prioritize your tasks...  


In [10]:
print(df.isna().sum())


Human        0
Assistant    0
dtype: int64


Sentence-based chunking is used to split the dataset. After splitting, the resulting list of nodes is indexed and added to ChromaDB.

# **Chunking**


In [11]:
parser = SentenceSplitter(chunk_size=250, chunk_overlap=0)
fit_nodes = parser.get_nodes_from_documents(Fitness_docs)
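
To make the idea of sentence-based chunking concrete, here is a simplified sketch that greedily packs whole sentences into chunks under a size budget. It is not how `SentenceSplitter` is implemented: it counts whitespace-separated words instead of tokens, ignores overlap, and the function name `chunk_sentences` and the sample text are made up for illustration.

```python
import re


def chunk_sentences(text: str, chunk_size: int = 250) -> list[str]:
    """Greedily pack whole sentences into chunks of at most chunk_size words.

    Words stand in for tokens here; a single sentence longer than
    chunk_size is kept whole rather than split further.
    """
    # Split after sentence-ending punctuation or on line breaks
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+|\n+", text) if s.strip()]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget
        if current and current_len + n_words > chunk_size:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Planks build core strength. Leg raises target the lower abs.\nRest a minute between sets."
for chunk in chunk_sentences(sample, chunk_size=8):
    print(chunk)
```

With a budget of 8 words, each sample sentence ends up in its own chunk. A larger budget merges neighbouring sentences.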


Printing the output to check the structure of the parsed nodes

In [12]:
print(fit_nodes)

[TextNode(id_='44cfffa5-b5a0-40da-b7b1-032733abc694', embedding=None, metadata={'filename': 'FITNESS.csv', 'extension': '.csv'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='0872f5e4-1461-4d6c-90db-791910741eee', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'filename': 'FITNESS.csv', 'extension': '.csv'}, hash='8fb33f6704364f48500ecfb26200f5a1bf486442de284c234598906688515324'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='39c535ac-8f1e-4a07-a8d4-ae009de3c8f6', node_type=<ObjectType.TEXT: '1'>, metadata={}, hash='302a0a2c98e9403e12e3d5fee1eac79b9c11d9ae502a759043524c7c98c40533')}, text='Human, Assistant\ncan you recommend effective ab exercises, planks bicycle crunches and leg raises are great for targeting core muscles incorporate these exercises into your routine and focus on engaging your core during other workouts too\nwhat are some effective strategies for managing and reducing s

Checking the type of fit_nodes

In [13]:
print(type(fit_nodes))

<class 'list'>


Converting fit_nodes to Document objects

In [14]:
from llama_index.core import VectorStoreIndex, StorageContext, Document
fit_docs=[Document(doc_id=r.node_id, text=r.text) for r in fit_nodes]

In [15]:
print(fit_docs)

[Document(id_='44cfffa5-b5a0-40da-b7b1-032733abc694', embedding=None, metadata={}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Human, Assistant\ncan you recommend effective ab exercises, planks bicycle crunches and leg raises are great for targeting core muscles incorporate these exercises into your routine and focus on engaging your core during other workouts too\nwhat are some effective strategies for managing and reducing stressrelated symptoms such as anxiety or insomnia,  make time for relaxation take time to catch your breath with daily activities that you find enjoyable such as exercise reading listening to music or taking a hot bath\n\n connect with others reach out to family friends colleagues and online support groups\n\n learn mindful practices mindfulness techniques such as yoga meditation and guided imagery can help reduce stress\n\n engage in healthy activities eating a balanced diet spending time outdoors and getting adequate r

# Vector Database

**Vector Database Creation**


In [16]:
!pip install llama-index sentence-transformers chromadb




In [17]:
# Import ChromaVectorStore and chromadb module
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Create client ("db") and a database ("chroma_db")
db = chromadb.PersistentClient(path="./chroma_db")



**Embedding model initialization**

In [18]:
embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

Settings.llm = llm
Settings.embed_model = embed_model


In [19]:

# Create client ("db") and a database ("chroma_db")
db = chromadb.PersistentClient(path="./chroma_db")

# Create a collection/table ("Fitness_trial") in the db
chroma_collection = db.create_collection("Fitness_trial")

# Set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Specify Chroma as our vector db
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create the vector index
vector_index = VectorStoreIndex.from_documents(
    documents=fit_docs, # the Document objects created earlier
    storage_context = storage_context,
)

# Print the collection details
print(chroma_collection)

# Print the name of the collection (table)
print(f'Collection name is: {chroma_collection.name}')

name='Fitness_trial' id=UUID('8afeded8-70f5-471c-bb7d-c1d04e591186') metadata=None tenant='default_tenant' database='default_database'
Collection name is: Fitness_trial
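
Conceptually, the collection stores one vector per chunk and returns the chunks closest to a query vector. The sketch below is an in-memory stand-in for that behaviour, not Chroma's implementation: the class `TinyVectorStore`, its methods, and the 2-dimensional vectors are made up, and Chroma's actual index and distance metric differ.

```python
import heapq
import math


class TinyVectorStore:
    """In-memory stand-in for a vector collection: stores unit vectors with their text."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float], str]] = []

    @staticmethod
    def _normalize(vector: list[float]) -> list[float]:
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0.0:
            raise ValueError("cannot store or query a zero vector")
        return [x / norm for x in vector]

    def add(self, doc_id: str, vector: list[float], text: str) -> None:
        self._rows.append((doc_id, self._normalize(vector), text))

    def query(self, vector: list[float], k: int = 2) -> list[tuple[float, str, str]]:
        """Return the k most similar rows as (score, doc_id, text), best first."""
        q = self._normalize(vector)
        # Dot product of unit vectors equals their cosine similarity
        scored = (
            (sum(a * b for a, b in zip(q, vec)), doc_id, text)
            for doc_id, vec, text in self._rows
        )
        return heapq.nlargest(k, scored, key=lambda row: row[0])


store = TinyVectorStore()
store.add("a", [1.0, 0.0], "biceps curl tips")
store.add("b", [0.0, 1.0], "stress management")
store.add("c", [0.7, 0.7], "full-body routine")
print([doc_id for _, doc_id, _ in store.query([1.0, 0.1], k=2)])  # prints ['a', 'c']
```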


# Prompt Template

In [20]:
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core import ChatPromptTemplate

qa_prompt_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# Text QA Prompt
chat_text_qa_msgs = [
    ChatMessage(
        role=MessageRole.SYSTEM,
        content=(
            "Always answer the question, even if the context isn't helpful."
        ),
    ),
    ChatMessage(role=MessageRole.USER, content=qa_prompt_str),
]

text_qa_template = ChatPromptTemplate(chat_text_qa_msgs)
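
To see what the user message looks like once it is filled in, the sketch below formats the same template string with plain `str.format`. The retrieved chunks are made up for illustration. Note that the template only takes effect if it is handed to the query engine, for example as `vector_index.as_query_engine(text_qa_template=text_qa_template)`; the query cells in this notebook call `as_query_engine` without it, so they may be using the default prompt.

```python
qa_prompt_str = (
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Given the context information and not prior knowledge, "
    "answer the question: {query_str}\n"
)

# Made-up retrieved chunks, for illustration only
retrieved_chunks = [
    "concentration curls emphasize the biceps peak",
    "reverse curls also train the forearms",
]

# Fill the two placeholders the way a query engine would before calling the LLM
prompt = qa_prompt_str.format(
    context_str="\n\n".join(retrieved_chunks),
    query_str="Suggest Biceps Workout",
)
print(prompt)
```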

# Querying

**Compact querying**

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Suggest Biceps Workout")

print(response.response)



Based on the provided context information, here's a suggested biceps workout:

1. Dumbbell preacher curl (3 sets of 12-15 reps)
2. Standing dumbbell reverse curl (3 sets of 10-12 reps)
3. Palms-out incline biceps curl (3 sets of 12-15 reps)
4. Biceps curl to shoulder press complex (3 sets of 8-10 reps per exercise)
5. Concentration curl (3 sets of 10-12 reps)

This workout incorporates a variety of exercises that target the biceps from different angles, allowing for a well-rounded and effective workout. Remember to adjust the weight and reps based on your fitness level and goals.
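
The `compact` mode concatenates retrieved chunks into as few prompts as fit the model's context window, so fewer LLM calls are needed. The following is a rough sketch of that packing step only. It uses characters as a stand-in for tokens, and the function name `pack_chunks` and the sample chunks are made up.

```python
def pack_chunks(chunks: list[str], max_chars: int, separator: str = "\n\n") -> list[str]:
    """Greedily merge consecutive chunks so each packed block stays within max_chars.

    Characters stand in for tokens; an oversized single chunk is kept as its own block.
    """
    packed: list[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + separator + chunk
        if current and len(candidate) > max_chars:
            # The block is full: close it and start a new one with this chunk
            packed.append(current)
            current = chunk
        else:
            current = candidate
    if current:
        packed.append(current)
    return packed


chunks = ["curl tips", "press tips", "grip tips"]  # made-up retrieved chunks
print(len(pack_chunks(chunks, max_chars=25)))  # fewer blocks means fewer LLM calls
```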


**Comparison with a Different Type of Query**

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Suggest excersises for chest")

print(response.response)

Considering your overall wellness approach, I'd recommend exercises that are gentle on your joints while still targeting the chest muscles. Here's a suggestion:

Try incorporating some modified push-ups or wall push-ups into your routine. These exercises can help engage your chest muscles without putting excessive strain on your joints.

Alternatively, you could consider doing some dumbbell or resistance band exercises that focus on the chest area. These can be done in a controlled manner to ensure proper form and minimize risk of injury.

Remember to start slowly and gradually increase the intensity and duration as your body adapts. It's also essential to listen to your body and adjust your exercise routine accordingly.

By incorporating these exercises into your workout routine, you'll be taking care of your overall well-being while also addressing specific areas like chest strength.


**Refined Querying**

In [None]:
query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("Suggest Biceps Workout")

print(response.response)



Here's a rewritten answer incorporating the new context:

To build and strengthen the biceps and shoulders or deltoids, I'd suggest the following workout routine. Start with the dumbbell complex that combines the biceps curl to shoulder press, focusing on one rep of each movement at a time. You can adjust the reps in successive rounds as needed.

Next, include exercises that target specific aspects of the muscle. Consider adding concentration curls to emphasize the peak of the bicep. Perform these with moderate to high reps (12-15 or more) for 3-4 sets, resting for 60-90 seconds between sets. Adjust the weight and reps based on your fitness level and goals.

Lastly, don't forget to incorporate exercises that target the forearm and grip strength simultaneously. The standing dumbbell reverse curl is an excellent addition to any workout routine. Use lower reps with heavier weights to focus on building overall arm strength.
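
The `refine` mode walks through the retrieved chunks one at a time: it answers from the first chunk, then asks the LLM to revise that answer with each further chunk. This may be why the output above opens with a remark about a rewritten answer. Below is a minimal sketch of that loop with a stub in place of the LLM; the prompt wording, `refine_answer`, and `fake_llm` are assumptions for illustration, not LlamaIndex's actual prompts or API.

```python
from typing import Callable


def refine_answer(query: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Answer from the first chunk, then revise the answer once per remaining chunk."""
    if not chunks:
        raise ValueError("need at least one retrieved chunk")
    answer = llm(f"Context: {chunks[0]}\nQuestion: {query}")
    for chunk in chunks[1:]:
        answer = llm(
            f"Existing answer: {answer}\nNew context: {chunk}\n"
            f"Refine the answer to: {query}"
        )
    return answer


calls: list[str] = []


def fake_llm(prompt: str) -> str:
    """Stub that records each prompt and returns a numbered placeholder draft."""
    calls.append(prompt)
    return f"draft {len(calls)}"


print(refine_answer("Suggest Biceps Workout", ["chunk A", "chunk B", "chunk C"], fake_llm))
print(len(calls))  # one LLM call per chunk
```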


**Tree Summarize Querying**

In [None]:
query_engine = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine.query("Suggest Biceps Workout")

print(response.response)



Based on the provided context, I suggest a comprehensive biceps workout that incorporates various exercises and reps. Here's a possible routine:

1. Warm-up: Start with some light cardio or arm circles to get your blood flowing and prepare your muscles for the upcoming workout.
2. Dumbbell Seated Biceps Curl: 3 sets of 12-15 reps
	* Focus on strict form and mind-muscle connection to target the biceps peak.
3. Dumbbell Preacher Curl: 3 sets of 10-12 reps
	* Use a moderate to high rep range to build strength and size in the biceps, particularly the peak.
4. Standing Dumbbell Reverse Curl: 2 sets of 8-10 reps
	* Alternate between palms-down and palms-up grip to target both biceps and forearms.
5. Biceps Curl to Shoulder Press Complex: 3 sets of 6-8 reps (1 rep each of curl and press)
	* Combine two exercises in one set to efficiently hit the biceps and shoulders.
6. Concentration Curl: 2 sets of 10-12 reps
	* Finish off your workout with this classic exercise, focusing on emphasizing the 
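
The `tree_summarize` mode answers over groups of chunks and then recursively combines those intermediate answers until a single one remains. The sketch below illustrates the recursion with a fixed group size and a stub LLM. The real mode packs as many chunks per call as fit the context window, and `tree_summarize`, `fan_in`, and `counting_llm` here are illustrative names, not library API.

```python
from typing import Callable


def tree_summarize(query: str, chunks: list[str], llm: Callable[[str], str], fan_in: int = 2) -> str:
    """Combine chunks bottom-up, fan_in at a time, until one answer remains."""
    if not chunks:
        raise ValueError("need at least one retrieved chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = chunks
    while True:
        # One LLM call per group; the results form the next level of the tree
        level = [
            llm(f"Question: {query}\nContext:\n" + "\n".join(level[i:i + fan_in]))
            for i in range(0, len(level), fan_in)
        ]
        if len(level) == 1:
            return level[0]


prompts: list[str] = []


def counting_llm(prompt: str) -> str:
    """Stub that records each prompt and returns a numbered placeholder summary."""
    prompts.append(prompt)
    return f"summary {len(prompts)}"


result = tree_summarize("Suggest Biceps Workout", ["c1", "c2", "c3", "c4"], counting_llm)
print(result, len(prompts))  # four chunks: two leaf calls plus one root call
```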

# Testing

**Testing Compact Querying**

Workout routine for specific gender and goal

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Suggest a workout routine  for women aiming weight loss")

print(response.response)

I'd be happy to help!

To suggest a workout routine for women aiming at weight loss, I would recommend defining specific goals, whether it's strength training, endurance, fat loss, or a combination. It's essential to research and choose a program aligned with individual preferences and time commitment.

For weight loss, a well-rounded routine that combines cardio and strength training can be effective. Aim for 150-200 minutes of moderate-intensity exercise per week, which can include activities like:

* High-intensity interval training (HIIT) workouts
* Brisk walking or jogging
* Swimming or cycling
* Strength training exercises like squats, lunges, push-ups, and rows

In addition to regular exercise, focus on a balanced diet that includes healthy proteins, complex carbohydrates, and fats. Aim to include a variety of foods from different food groups to ensure you're getting all the essential nutrients your body needs.

Remember to start slow, set achievable goals, and gradually increas

Meal plan for a specific dietary requirement and goal

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Give a protien rich meal plan for vegetarians aiming muscle gain")

print(response.response)

As a protein-rich meal plan for vegetarians aiming to gain muscle, I recommend incorporating a variety of plant-based foods that are high in protein. Here's a sample meal plan:

Breakfast:

* Tofu scramble with spinach, mushrooms, and whole wheat toast (20g protein)
* Quinoa breakfast bowl with almond milk, chia seeds, and sliced almonds (15g protein)

Snack:

* Greek yogurt with hemp seeds and berries (15g protein)
* Edamame and whole grain crackers (10g protein)

Lunch:

* Lentil soup with quinoa and mixed vegetables (20g protein)
* Grilled tofu or tempeh wrap with brown rice, avocado, and sprouts (25g protein)

Snack:

* Protein smoothie with pea protein powder, almond milk, banana, and spinach (30g protein)
* Roasted chickpeas seasoned with herbs and spices (10g protein)

Dinner:

* Quinoa and black bean bowl with roasted vegetables and a drizzle of tahini sauce (20g protein)
* Grilled portobello mushrooms with brown rice, steamed broccoli, and a side of almonds (25g protein)

Befo

Overall suggestion for a person with chronic illness

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Iam a person with neurological conditions how can i improve my fitness")

print(response.response)

Considering the provided context information, it seems that regular physical activity is essential for overall health and wellbeing. For someone with neurological conditions, I would recommend consulting with a healthcare professional to determine the most suitable exercises or activities for their specific condition.

In general, exercising studies have shown that regular physical activity can help slow down age-related cognitive decline, improve brain activity, increase cognitive reserve, and reduce age-related risks of dementia and Alzheimer's. Additionally, exercise has been found to improve overall mental health and wellbeing.

Given your neurological conditions, I would suggest starting with low-impact aerobic exercises, such as yoga or swimming, which can help improve cardiovascular fitness while minimizing the risk of injury. You may also consider working with a physical therapist or a certified personal trainer who specializes in exercising with neurological conditions.

Remem

Well-Being and Nutrition for specific conditions

In [21]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("iam highly stressed due to my exams how can i improve my well being and nutrition")

print(response.response)

I understand that you're feeling highly stressed due to your exams. To improve your well-being and nutrition, I would recommend exercising regularly, which can help reduce stress and anxiety by releasing endorphins. Additionally, practicing relaxation techniques such as deep breathing or meditation can also help manage your stress levels.

In terms of nutrition, eating a well-balanced diet with plenty of fresh fruits and vegetables can provide essential nutrients for your body. Make sure to stay hydrated by drinking plenty of water throughout the day. You may also want to consider incorporating more whole, nutrient-dense foods into your diet to support overall health and well-being.

Lastly, taking care of your physical health is crucial during this time. Aim to get enough sleep each night, which can help regulate your stress levels. By prioritizing self-care and making healthy lifestyle choices, you'll be better equipped to manage your stress and perform at your best during your exams

No-Context Querying

In [22]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Is Iphone 13 a good phone")

print(response.response)

I see no connection between the provided context information and your query about the iPhone 13. The context is focused on setting boundaries with technology, promoting healthy habits, and nurturing relationships, whereas the query is about a specific phone model.

In this case, I'll play by the rules and not directly reference the given context. A more neutral answer would be:

The effectiveness of an iPhone 13 or any other phone depends on various factors such as personal preferences, budget, and intended use. It's essential to research and compare features, prices, and user reviews before making a purchase decision.
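
In this test the retriever still returned fitness chunks for an unrelated question, and the model had to notice the mismatch itself. One possible guard, sketched below under assumed names and made-up scores, is to drop retrieved chunks whose similarity falls below a cutoff and decline when nothing remains. LlamaIndex offers a comparable filter in `SimilarityPostprocessor` with its `similarity_cutoff` parameter; the 0.5 threshold here is arbitrary and would need tuning.

```python
def filter_by_score(hits: list[tuple[str, float]], cutoff: float) -> list[tuple[str, float]]:
    """Keep only (text, score) hits whose similarity score reaches the cutoff."""
    return [(text, score) for text, score in hits if score >= cutoff]


def answer_or_decline(hits: list[tuple[str, float]], cutoff: float = 0.5) -> str:
    """Decline when no hit is relevant enough; otherwise list the context to use."""
    relevant = filter_by_score(hits, cutoff)
    if not relevant:
        return "No relevant fitness context found for this question."
    return "Answer using: " + " | ".join(text for text, _ in relevant)


# Made-up scores: an off-topic query tends to score low against every chunk
off_topic_hits = [("setting boundaries with technology", 0.21), ("healthy habits", 0.18)]
on_topic_hits = [("concentration curls", 0.74), ("healthy habits", 0.30)]
print(answer_or_decline(off_topic_hits))
print(answer_or_decline(on_topic_hits))
```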
