### Scrape Sports-Reference CFB Stats for a Single Season (2015)

This cell downloads NCAA player statistics from Sports-Reference for the **2015** season only.

Stat categories scraped:
- **passing**
- **rushing**
- **receiving**

For each stat type, the script:
1. Builds the URL: `https://www.sports-reference.com/cfb/years/{year}-{stat}.html`  
2. Requests the page  
3. Loads the first stats table using `pandas.read_html`  
4. Saves it as a CSV named:  
   **`{year}_{stat}.csv`**  
5. Pauses 1 second between requests to reduce the risk of being rate-limited  

You will see confirmation messages for successful saves, or error messages if scraping fails.
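
Steps 1 and 4 above amount to two string templates. As a minimal sketch, they can be factored into small helpers. The names `build_url`, `csv_name`, and `jobs` are hypothetical and are not part of the notebook's own cell.

```python
from itertools import product

BASE = "https://www.sports-reference.com/cfb/years"  # URL prefix from step 1


def build_url(year: int, stat: str) -> str:
    """Return the season page URL for one stat category (hypothetical helper)."""
    return f"{BASE}/{year}-{stat}.html"


def csv_name(year: int, stat: str) -> str:
    """Return the output filename used in step 4 (hypothetical helper)."""
    return f"{year}_{stat}.csv"


def jobs(years, stats):
    """Pair every year with every stat type, in the order a nested loop visits them."""
    return [(build_url(y, s), csv_name(y, s)) for y, s in product(years, stats)]


for url, filename in jobs(range(2015, 2016), ["passing", "rushing", "receiving"]):
    print(url, "->", filename)
```

Keeping the templates in one place means a change to the URL pattern or the file naming only has to be made once.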


In [None]:
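import time
from io import StringIO

import pandas as pd
import requests

years = list(range(2015, 2016))  # range end is exclusive, so this is 2015 only
stats = ["passing", "rushing", "receiving"]

for year in years:
    for stat in stats:
        url = f"https://www.sports-reference.com/cfb/years/{year}-{stat}.html"
        try:
            # A timeout keeps a stalled connection from hanging the cell.
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            # Wrap the HTML in StringIO: passing a literal HTML string to
            # read_html is deprecated in newer pandas versions.
            dfs = pd.read_html(StringIO(response.text))
            if dfs:
                df = dfs[0]  # first table on the page
                df.to_csv(f"{year}_{stat}.csv", index=False)
                print(f"Saved {year}_{stat}.csv")
            else:
                print(f"No tables found for {url}")
        except Exception as e:
            # Covers HTTP errors, timeouts, and read_html finding no tables.
            print(f"Failed for {url}: {e}")
        time.sleep(1)  # be polite to the server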

Saved 2015_passing.csv
Saved 2015_rushing.csv
Saved 2015_receiving.csv


### Scrape Sports-Reference CFB Stats (Passing / Rushing / Receiving)

This cell downloads NCAA player statistics directly from Sports-Reference for the
selected years and stat types.

- Years scraped: **2016–2024**
- Stat pages scraped:
  - `passing`
  - `rushing`
  - `receiving`
- Each page is saved locally as:
  **`{year}_{stat}.csv`**

The script:
1. Builds the correct Sports-Reference URL  
2. Requests the HTML page  
3. Extracts the first table using `pandas.read_html`  
4. Saves the output as a CSV  
5. Sleeps 1 second between requests to reduce the risk of being rate-limited  

You will see a success message for each file saved or an error message if the page fails.
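
With nine seasons and three stat types, this cell makes 27 requests, so a fixed 1-second pause may not always be enough to stay under a server's limits. One way to handle that, shown here as a minimal sketch, is to retry with exponential backoff when the server answers HTTP 429 (Too Many Requests), using a numeric `Retry-After` header if one is present. The helper `get_with_backoff` and its parameters are hypothetical, and whether Sports-Reference sends `Retry-After` is an assumption. `fetch` is any callable that returns an object with `status_code` and `headers`, such as `requests.get`.

```python
import time


def get_with_backoff(fetch, url, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url), retrying on HTTP 429 with exponentially growing waits.

    Hypothetical helper: `fetch` must return an object exposing `status_code`
    and `headers` (for example, `requests.get`).
    """
    for attempt in range(max_attempts):
        response = fetch(url)
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts; do not wait again
        # Prefer the server's hint when it is a plain number of seconds.
        hint = response.headers.get("Retry-After", "")
        delay = float(hint) if hint.isdigit() else base_delay * 2 ** attempt
        sleep(delay)
    raise RuntimeError(f"Still rate-limited after {max_attempts} attempts: {url}")
```

In this cell's loop, `requests.get(url)` could then be replaced by `get_with_backoff(requests.get, url)`. The existing `try`/`except` would report the `RuntimeError` like any other failure.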


In [None]:
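import time
from io import StringIO

import pandas as pd
import requests

years = list(range(2016, 2025))  # range end is exclusive, so this covers 2016 through 2024
stats = ["passing", "rushing", "receiving"]

for year in years:
    for stat in stats:
        url = f"https://www.sports-reference.com/cfb/years/{year}-{stat}.html"
        try:
            # A timeout keeps a stalled connection from hanging the cell.
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            # Wrap the HTML in StringIO: passing a literal HTML string to
            # read_html is deprecated in newer pandas versions.
            dfs = pd.read_html(StringIO(response.text))
            if dfs:
                df = dfs[0]  # first table on the page
                df.to_csv(f"{year}_{stat}.csv", index=False)
                print(f"Saved {year}_{stat}.csv")
            else:
                print(f"No tables found for {url}")
        except Exception as e:
            # Covers HTTP errors, timeouts, and read_html finding no tables.
            print(f"Failed for {url}: {e}")
        time.sleep(1)  # be polite to the server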