# Step 1
I wanted this assignment to produce something I could actually use. I have been planning to start going to the gym for a while, so I used the assignment as a step toward that goal. I found a website with a library of exercises, where each exercise has a title, the equipment it requires, a level (beginner, intermediate, advanced), and the muscle groups it targets. Since I'm a beginner, I decided to collect all beginner exercises together with the rest of their information and build a DataFrame that holds it. This is the website I found: https://www.acefitness.org/resources/everyone/exercise-library/
And this is the specific URL that shows only the exercises for beginners: https://www.acefitness.org/resources/everyone/exercise-library/experience/beginner/

# Step 2
Install and import the necessary Python libraries (requests, Scrapy, etc.)

In [1]:
# Import a scrapy Selector
from scrapy import Selector

# Import requests
import requests

# Used to pause between requests to different subpages
# so that I do not get banned :)
import time


# Step 2 (cont.)
I looked at the xpaths for titles, equipment information, and muscle groups separately for a couple of examples to see what kind of pattern there is:

### Titles
/html/body/div[4]/div[2]/div/main/div[3]/section/div[1]/div[1]/a/div[2]/header/h2
/html/body/div[4]/div[2]/div/main/div[3]/section/div[4]/div[2]/a/div[2]/header/h2


### Equipment information
/html/body/div[4]/div[2]/div/main/div[3]/section/**div[1]/div[1]**/a/div[2]/div/dl/div[2]/dd
/html/body/div[4]/div[2]/div/main/div[3]/section/**div[4]/div[2]**/a/div[2]/div/dl/div[2]/dd



### Muscle groups
/html/body/div[4]/div[2]/div/main/div[3]/section/**div[1]/div[1]**/a/div[2]/div/dl/div[1]/dd
/html/body/div[4]/div[2]/div/main/div[3]/section/**div[4]/div[2]**/a/div[2]/div/dl/div[1]/dd

---

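It seems like the website is organized in such a way that the parts I bolded correspond to the row and column position of each exercise on the page. Since I do not want a specific exercise in a specific location, I replace both indices with an asterisk to collect all exercises on a page. Strictly speaking, div[\*] is an XPath predicate that matches every div with at least one child element rather than a true index wildcard, but every exercise card has child elements, so the effect here is the same. Here are the XPaths: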

**Titles:** /html/body/div[4]/div[2]/div/main/div[3]/section/**div[\*]/div[\*]**/a/div[2]/header/h2

**Equipment information:** /html/body/div[4]/div[2]/div/main/div[3]/section/**div[\*]/div[\*]**/a/div[2]/div/dl/div[2]/dd

**Muscle groups:**
/html/body/div[4]/div[2]/div/main/div[3]/section/**div[\*]/div[\*]**/a/div[2]/div/dl/div[1]/dd

---
I also checked how many pages of results there are for these exercises, and there are 10. So I will use a for loop to go through the 10 subpages and collect the information from all of them.
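
Hard-coding 10 pages works as long as the website does not change. A minimal sketch of an alternative is to keep requesting pages until one comes back empty. Everything here is an assumption for illustration: `collect_pages` and `fetch_titles` are hypothetical names, the `fake_site` dictionary stands in for the real website so the example runs offline, and I have not verified that the real site returns an empty page past the last one.

```python
import time
from typing import Callable, Dict, List


def collect_pages(fetch_titles: Callable[[int], List[str]],
                  max_pages: int = 10,
                  delay_seconds: float = 0.0) -> List[str]:
    """Collect titles page by page, stopping at the first empty page."""
    all_titles: List[str] = []
    for page_number in range(1, max_pages + 1):
        titles = fetch_titles(page_number)
        if not titles:  # Nothing found: assume we ran past the last page
            break
        all_titles.extend(titles)
        time.sleep(delay_seconds)  # Pause between requests
    return all_titles


# Offline stand-in for the website: page number -> titles on that page
fake_site: Dict[int, List[str]] = {1: ["Crunch", "Plank"], 2: ["Squat"]}

result = collect_pages(lambda page: fake_site.get(page, []), max_pages=10)
print(result)  # ['Crunch', 'Plank', 'Squat']
```

`max_pages` still acts as an upper bound, so a page that never comes back empty cannot make the loop run forever.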


# Step 3

In [2]:
# Define the root URL
root_url = ('https://www.acefitness.org/resources/everyone/'
            'exercise-library/experience/beginner/?page=')


# Define the xpaths
xpath_for_body_area = '/html/body/div[4]/div[2]/div/main/div[3]/section/div[*]/div[*]/a/div[2]/div/dl/div[1]/dd/text()'
xpath_for_title = '/html/body/div[4]/div[2]/div/main/div[3]/section/div[*]/div[*]/a/div[2]/header/h2/text()'
xpath_for_equipment = '/html/body/div[4]/div[2]/div/main/div[3]/section/div[*]/div[*]/a/div[2]/div/dl/div[2]/dd/text()'

# Define empty data structures for later use
zip_list = []
fitness_dictionary = {}

# Enter a for loop for how many different pages there are starting from 1
for page_number in range(1,11):
    
    url_for_page = f"{root_url}{page_number}"  # Specify the page for the iteration

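    response = requests.get(url_for_page, timeout=30)  # Request this page, giving up after 30 seconds
    response.raise_for_status()  # Stop right away if the request failed
    html = response.content  # Get the content for this page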

    sel = Selector(text = html)  # Select the content for this page


    # Bundle the information together for each exercise and store it in a list
    zip_object = zip(sel.xpath(xpath_for_title).extract(),
                     sel.xpath(xpath_for_body_area).extract(),
                     sel.xpath(xpath_for_equipment).extract())
    zip_list.append(zip_object)

    # Pause between requests, just in case, to avoid getting banned :)
    time.sleep(10.0)

# Turn the zipped objects into a dictionary that can be converted to a DataFrame
for zip_object in zip_list:
    for title, body_area, equipment in zip_object:
        fitness_dictionary[title] = [body_area, equipment]
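
One thing to keep in mind with this approach is that `zip` stops at the shortest input. If a single exercise on a page were missing, say, its equipment entry, the remaining titles would be paired with the wrong values without any error. Below is a minimal sketch of a guard against that. `bundle_exercises` is a hypothetical helper, not part of the code above, and the values in the usage example are placeholders rather than scraped data.

```python
from typing import Dict, List


def bundle_exercises(titles: List[str],
                     body_areas: List[str],
                     equipment: List[str]) -> Dict[str, List[str]]:
    """Pair up the three lists by position, refusing mismatched lengths."""
    lengths = {len(titles), len(body_areas), len(equipment)}
    if len(lengths) != 1:
        raise ValueError(
            f"Mismatched lengths: {len(titles)} titles, "
            f"{len(body_areas)} body areas, {len(equipment)} equipment entries"
        )
    return {title: [area, gear]
            for title, area, gear in zip(titles, body_areas, equipment)}


# Placeholder values for illustration only
bundled = bundle_exercises(["Crunch", "Plank"],
                           ["Abs", "Abs"],
                           ["No equipment", "No equipment"])
print(bundled)
# {'Crunch': ['Abs', 'No equipment'], 'Plank': ['Abs', 'No equipment']}
```

A dictionary keyed by title also keeps only the last entry when two exercises share the same title, so comparing the final number of keys with the number of scraped titles is another cheap check.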


# Step 4 & 5

In [4]:
import pandas as pd

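# Each exercise title becomes a column here, holding [body part, equipment]
fitness_df = pd.DataFrame(fitness_dictionary)
# Transpose so that each exercise becomes a row instead
fitness_df = fitness_df.T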


fitness_df = fitness_df.rename(columns={0: 'Body Part', 1: 'Equipment'})
fitness_df = fitness_df.rename_axis('Exercise Name')
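
The same shape can probably be reached in one step with `pd.DataFrame.from_dict`, which takes the dictionary keys as rows when `orient='index'` is passed and accepts the column names directly. This is a minimal sketch on a tiny placeholder dictionary (the values are made up for illustration, not scraped), and `sample_dictionary` and `sample_df` are hypothetical names.

```python
import pandas as pd

# Placeholder data in the same layout as fitness_dictionary:
# title -> [body part, equipment]
sample_dictionary = {
    "Crunch": ["Abs", "No equipment"],
    "Plank": ["Abs", "No equipment"],
}

# orient='index' makes each key a row, so no transpose or rename is needed
sample_df = pd.DataFrame.from_dict(
    sample_dictionary,
    orient="index",
    columns=["Body Part", "Equipment"],
).rename_axis("Exercise Name")

print(sample_df)
```

This avoids the intermediate wide DataFrame and the numeric column labels 0 and 1.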

# DataFrame & some useful code bits

In [24]:
fitness_df.to_csv("beginners_exercises.csv")

# Printing the DataFrame directly gave a visually unappealing output,
# so viewing it as a CSV file is much better


print("---\nAbs exercises:") # For better readability of the output

# Print out exercises that aim at strengthening the abs
print(fitness_df[fitness_df['Body Part'].str.contains("Abs", case=False)].index)

print("---\nResistance band exercises:") # For better readability of the output

# Print out exercises that use resistance bands
print(fitness_df[fitness_df['Equipment'].str.contains("Resistance bands", case=False)].index)

print("---\nArms & No equipment exercises:") # For better readability of the output

# Print out exercises that target arms which require no equipment
print(fitness_df[
          fitness_df['Body Part'].str.contains("Arms", case=False) &
          fitness_df['Equipment'].str.contains("No equipment", case=False)
      ].index
      )


---
Abs exercises:
Index(['Bodyweight Squat', 'Childs Pose', 'Cobra Exercise', 'Crunch',
       'Decline Plank', 'Dirty Dog', 'Forward Stepping over Cones ',
       'Glute Bridge Exercise', 'Half-kneeling Hay Baler', 'Kneeling ABC's',
       'Partner Assisted Bodyweight Squats', 'Partner Tricep Extension',
       'Quadruped Bent-knee Hip Extensions', 'Seated Crunch',
       'Seated Medicine Ball Trunk Rotations', 'Seated Side-Straddle Stretch ',
       'Side Plank - modified', 'Single Leg Stand',
       'Stability Ball Sit-ups / Crunches', 'Standing Crunch',
       'Supine Dead Bug', 'Supine Hollowing with Lower Extremity Movements',
       'Supine Pelvic Tilts', 'Supine Reverse Marches', 'Supine Rotator Cuff ',
       'Supine Snow Angel (Wipers) Exercise', 'Upward Facing Dog',
       'V Sit Partner Rotations'],
      dtype='object', name='Exercise Name')
---
Resistance band exercises:
Index(['Ankle Flexion ', 'Anti-rotation Reverse Lunge',
       'Partner Tricep Extension', 'Prone (Ly