# Week 9 
## In-class practice: files and the internet

Complete the tasks below. Some will be done in this notebook; others will need to be .py files. If you run .py files as part of this work, turn them in AND include screenshots in this notebook showing that you ran the code, the results, how you modified it, etc. This Jupyter notebook should tell the story of the work you have done, just as a lab notebook should.

1. File format practice. Create a CSV file (any way you like), open it with Python, write some data to it, save it, and close it. Re-open it with Python to confirm that your changes were made in the .csv file. Put this code in some cells below. When you turn in this notebook, also turn in the CSV file that you created.

2. Requests and BeautifulSoup. Below, write a few sentences describing what these are. Then, try them out on another website besides cnn.com. Modify the code below and display your results.

In [16]:
import requests
from bs4 import BeautifulSoup
# If you need to install these, use pip install <nameofmodule>; they are probably already installed in Colab.
# (The bs4 module is published on PyPI as beautifulsoup4.)

# Requests is a Python library that simplifies making HTTP requests to web servers. It lets you send GET, POST, and other types of requests to fetch web content.

# BeautifulSoup is a library for parsing HTML and XML documents. It builds a parse tree from the page source that you can search, navigate, and modify to extract data, which makes it well suited to web scraping.

import csv

# Create and write to CSV
with open('sample_data.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Name', 'Age', 'City'])
    writer.writerow(['John', 25, 'New York'])
    writer.writerow(['Alice', 30, 'Chicago'])
    writer.writerow(['Bob', 35, 'Los Angeles'])

# Read from CSV to verify
with open('sample_data.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)



['Name', 'Age', 'City']
['John', '25', 'New York']
['Alice', '30', 'Chicago']
['Bob', '35', 'Los Angeles']
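
As a possible extension of the cell above, the sketch below appends a row to an existing CSV file and re-reads it with csv.DictReader, which keys each row by the header names. It is only a minimal sketch: the file is written to a temporary directory so nothing real is overwritten, and the extra row ("Dana", 28, "Boston") is made up for illustration.

```python
import csv
import tempfile
from pathlib import Path

# Work in a temporary directory so the example does not overwrite real files.
csv_path = Path(tempfile.mkdtemp()) / "sample_data.csv"

# Create the file with a header and one row (same columns as the cell above).
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Age", "City"])
    writer.writerow(["John", 25, "New York"])

# Re-open in append mode ("a") to add a row without erasing existing data.
with open(csv_path, "a", newline="") as f:
    csv.writer(f).writerow(["Dana", 28, "Boston"])  # hypothetical row

# Re-open for reading; DictReader maps each row to the header names.
with open(csv_path, newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows:
    print(row["Name"], row["Age"], row["City"])
```

Note that every value comes back as a string (the age 25 is read as "25"), so convert with int() or float() before doing arithmetic.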


In [17]:
# Check that BeautifulSoup works
html = "<html><body><h1>Hello, World!</h1></body></html>"
soup = BeautifulSoup(html, "html.parser")

print(soup.h1.text)  # Output: Hello, World!

Hello, World!


3. Web scraping with requests and BeautifulSoup

In [18]:
# Used a different website

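# Fetch the Wikipedia portal page
url = "https://www.wikipedia.org"
response = requests.get(url, timeout=10)  # avoid hanging forever on a slow server
response.raise_for_status()  # stop early on a 4xx/5xx response
html_content = response.text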

# Parse the HTML
soup = BeautifulSoup(html_content, "html.parser")

# Extract specific content
title = soup.find("title").text
print(f"Page title: {title}")

Page title: Wikipedia


3a. Try to extract another piece of data from a website. Look through the page's HTML source and see whether there is another section of the website you chose for soup to find. I understand some of you have not learned HTML, but you need to learn a bit of everything regardless! Print it to the screen and put your code and the result in a cell below.
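
One way to approach this kind of search is sketched below. It parses a small, made-up HTML string (not a real site) so that it runs without a network connection; the tag names, the class "headline", and the id "nav" are all assumptions chosen for illustration. On a real page you would find the equivalent names by viewing the page source.

```python
from bs4 import BeautifulSoup

# Hypothetical page source; a real page would come from requests.get(url).text
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <h1 class="headline">Top story</h1>
    <ul id="nav">
      <li><a href="/news">News</a></li>
      <li><a href="/sports">Sports</a></li>
    </ul>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first match; class_ filters on the class attribute.
headline = soup.find("h1", class_="headline").get_text(strip=True)

# find_all() returns every match; a tag's attributes are read like dict keys.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

# select() accepts CSS selectors, handy once you have inspected the source.
nav_items = [li.get_text(strip=True) for li in soup.select("#nav li")]

print(headline)
print(links)
print(nav_items)
```

If find() does not match anything it returns None, so on a real page it is worth checking the result before calling get_text() on it.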

3b. Building Web APIs with Flask (somewhat advanced level)
Description:
Flask is a lightweight framework for building web applications and REST APIs. Read some documentation on it (and look up what a REST API is).
Then make sure you can get the code below to run and display the message in your browser.
Getting Started:
Install Flask: pip install flask # in VS Code, run this in the terminal; in Colab you might need a ! in front of the command

In [19]:
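# Extract the featured language entries from the portal page.
# Selecting by class alone avoids assuming which tag carries it
# (the class appears to sit on the wrapper around each link, not on the <a>).
links = soup.select('.central-featured-lang')
for link in links[:5]:  # Show first 5 language options
    print(f"Language: {link.get_text(' ', strip=True)}")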

A REST API is a web API that follows REST (Representational State Transfer), an architectural style for networked applications. It typically uses HTTP methods such as GET, POST, PUT, and DELETE to read, create, update, and delete resources. RESTful APIs are stateless: each request contains all the information the server needs to handle it, and the interface is meant to be simple and uniform.
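
To make the idea concrete, here is a minimal sketch of a REST-style resource in Flask. The /api/notes route, the in-memory notes dictionary, and the sample text are all hypothetical and chosen only for illustration; a real API would normally use a database and validate its input. The sketch uses Flask's test client, which sends requests to the app without starting a server, so it can also run inside a notebook.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory store standing in for a database (hypothetical data).
notes = {1: "buy milk"}

@app.route("/api/notes/<int:note_id>", methods=["GET"])
def get_note(note_id):
    if note_id not in notes:
        return jsonify({"error": "not found"}), 404
    return jsonify({"id": note_id, "text": notes[note_id]})

@app.route("/api/notes/<int:note_id>", methods=["PUT"])
def put_note(note_id):
    # The request body carries everything the server needs; no session
    # state is kept between requests, only the stored resource itself.
    notes[note_id] = request.get_json()["text"]
    return jsonify({"id": note_id, "text": notes[note_id]})

@app.route("/api/notes/<int:note_id>", methods=["DELETE"])
def delete_note(note_id):
    notes.pop(note_id, None)
    return "", 204

# Exercise the routes with the built-in test client (no server needed).
client = app.test_client()
created = client.put("/api/notes/2", json={"text": "call Sam"})
fetched = client.get("/api/notes/2")
deleted = client.delete("/api/notes/2")
missing = client.get("/api/notes/2")

print(created.status_code, fetched.get_json(), deleted.status_code, missing.status_code)
```

The same URL handles different HTTP methods, and the status codes (200, 204, 404) tell the client what happened. A POST route for creating notes without a known id is left out to keep the sketch short.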

In [20]:
#pip install flask


In [21]:
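# Pay attention! Put the code below in a file called app.py and run it in VS Code, then open the address it prints in your browser.
# What do you see?
# (Inside a notebook, debug=True starts Flask's reloader, which typically fails there with SystemExit.)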
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/')
def home():
    return "Hello, Flask!"

@app.route('/api/data')
def data():
    return jsonify({"message": "Hello, API!"})

if __name__ == "__main__":
    app.run(debug=True)


 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
 * Restarting with stat


SystemExit: 1

3c. Interacting with Web APIs using requests.

The requests library makes it simple to send HTTP requests to APIs and handle responses.

In [None]:
import requests

# Fetch data from a public API
url = "https://api.coindesk.com/v1/bpi/currentprice.json"

try:
    response = requests.get(url, timeout=10)
except requests.RequestException as exc:
    # Covers DNS failures, timeouts, refused connections, etc.
    print(f"Request failed: {exc}")
else:
    # Parse JSON response
    if response.status_code == 200:
        data = response.json()
        print(f"Bitcoin Price (USD): {data['bpi']['USD']['rate']}")
    else:
        print(f"Failed to fetch data. Status code: {response.status_code}")
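
Since this cell depends on a live API, the handling of the JSON itself can be practiced offline. The sketch below uses a hypothetical payload shaped to match the keys the cell above reads (data['bpi']['USD']['rate']); the number is made up, and extract_rate is an illustrative helper, not part of requests.

```python
import json

# Hypothetical response body, shaped like the keys the cell above expects.
# The rate value is invented for illustration.
sample_body = '{"bpi": {"USD": {"rate": "12,345.67"}}}'

def extract_rate(payload, currency="USD"):
    """Return the rate string for `currency`, or None if any key is missing."""
    try:
        return payload["bpi"][currency]["rate"]
    except (KeyError, TypeError):
        return None

data = json.loads(sample_body)  # response.json() performs this step for you
usd_rate = extract_rate(data)
eur_rate = extract_rate(data, "EUR")  # key is absent, so this is None

# The rate is a string with a thousands separator; convert before doing math.
usd_value = float(usd_rate.replace(",", ""))

print(usd_rate, eur_rate, usd_value)
```

Guarding the key lookups this way means a changed or partial response produces None instead of a KeyError traceback.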


3d. Asynchronous Web Requests with aiohttp
Description:
aiohttp is an asynchronous library for making HTTP requests, allowing you to handle multiple requests concurrently.
Look up aiohttp: what is it used for?




In [None]:
#pip install aiohttp

aiohttp is an asynchronous HTTP client/server framework for Python built on the asyncio library. It is particularly useful when you need to make many HTTP requests: while one request waits on the network, the event loop can work on the others, so the requests overlap (concurrently, on a single thread) instead of running one after another. For multiple URLs or API endpoints, this is usually much faster than making the same requests sequentially with a synchronous library.
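
The timing benefit can be seen without any network access. In the sketch below, asyncio.sleep stands in for waiting on an HTTP response; fake_fetch and its delays are made up for illustration and are not part of aiohttp. Like the aiohttp example, it is meant to be run as a Python script rather than in a notebook cell.

```python
import asyncio
import time

async def fake_fetch(name, delay):
    # asyncio.sleep stands in for waiting on a network response.
    await asyncio.sleep(delay)
    return f"{name} done"

async def run_concurrently():
    start = time.perf_counter()
    # gather() runs all three coroutines concurrently and waits for all of
    # them; results come back in the order the coroutines were passed in.
    results = await asyncio.gather(
        fake_fetch("a", 0.2),
        fake_fetch("b", 0.2),
        fake_fetch("c", 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(run_concurrently())
print(results)
print(f"elapsed: {elapsed:.2f}s (one after another would take about 0.6s)")
```

The three waits overlap, so the total time is close to the longest single wait rather than the sum of all three.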

In [None]:
# Warning: run this as a Python program in VS Code! (asyncio.run() raises an error inside a notebook, where an event loop is already running.)
import aiohttp
import asyncio

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = ["https://cnn.com", "https://google.com"]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        responses = await asyncio.gather(*tasks)
        for response in responses:
            print(response[:100])  # Print first 100 characters of each response

asyncio.run(main())


4. You should be at the point now with coding where you are not just doing the bare minimum, but exploring what you would like to do with code. In the space below, jot down some ideas about what you might want to do for your lab. In the past, I've had fin math majors create a tracking spreadsheet Python program. I've had others scrape data from the web for various uses and then go on to build apps from this data in future classes. What you create for the lab can be simple; the goal is to start thinking about what you can do with programming, and then what you will need to learn to build robust solutions. So, list a few ideas below AND please propose a cool idea in Slack for what you can do with web libraries, Flask, and files. (Web 1 students: yes, you can include this in your final web project if you want!) Turn in this notebook, and any associated files, on Moodle (due Thursday, end of class).

Here are some potential project ideas combining web libraries, Flask, and files:

1. Stock Market Dashboard
   - Scrape real-time stock data from financial websites
   - Store historical data in CSV files
   - Create a Flask web app to display trends and analytics

2. Weather Analysis Tool
   - Fetch weather data from multiple cities using weather APIs
   - Store historical weather data in CSV files
   - Create visualizations and predictions using the collected data

3. Social Media Analytics
   - Use APIs to collect social media data
   - Analyze sentiment and trends
   - Create a Flask dashboard to display insights

4. Personal Finance Tracker
   - Create a Flask web app for expense tracking
   - Store transactions in CSV files
   - Add visualization of spending patterns
   - Include cryptocurrency price tracking

5. Web Crawler for News Aggregation
   - Scrape news from multiple sources
   - Categorize articles using NLP
   - Create a Flask app to display personalized news feed