# Data and Code Management

## Flask API Tutorial
This tutorial will guide you through building a simple Flask API in Python. 

Flask library (install using pip install Flask)

Learning Objectives:

- Create a Flask application
- Define routes for API endpoints
- Handle different HTTP methods (GET, POST)
- Return JSON data
- Implement basic error handling

In [1]:
pip install flask

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


This cell imports the Flask class and creates an instance of a Flask application named app. The __name__ variable refers to the current Python module. Setting debug=True enables Flask's debugger, which helps identify errors during development.

It then defines a route for the root URL (/) using the @app.route decorator. The function hello_world will be executed when a GET request is made to this route. The function returns a string message, which will be the response sent back to the client.



In [2]:
from flask import Flask

app = Flask("First app")

In [3]:


@app.route("/")
def hello_world():
    return "<p>Hello, World!</p>"




In [4]:
# Run the Flask development server (optional for this cell)
app.run(host='localhost', port=5005)

 * Serving Flask app 'First app'
 * Debug mode: off


 * Running on http://localhost:5005
Press CTRL+C to quit
127.0.0.1 - - [14/Dec/2024 14:19:15] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [14/Dec/2024 14:19:15] "GET /favicon.ico HTTP/1.1" 404 -


This cell demonstrates how to create routes that accept variables. The <name> part in the URL pattern (/hello/<name>) is a placeholder that will be replaced with the actual value provided in the URL. The hello_name function receives this value as an argument named name. We use f-strings to dynamically create a greeting message that includes the provided name.

In [127]:
from flask import Flask

app = Flask("First app")

@app.route('/hello/<name>')
def hello_name(name):
  # Use the `name` variable from the URL
  return f"Hello, {name}!"


In [128]:
app.run(host='localhost', port=5006)
#go to http://localhost:5006/hello/John

 * Serving Flask app 'First app'
 * Debug mode: off


 * Running on http://localhost:5006
Press CTRL+C to quit


This cell introduces handling different HTTP methods (GET and POST). When the user visits /form with a GET request, the first section of the form function is executed. Here, we return an HTML form that allows the user to enter their name in a text input field and submit it.

When the user submits the form, a POST request is sent to the same URL (/form). 

In [23]:
from flask import Flask, request


app = Flask("First app")

@app.route('/form', methods=['GET', 'POST'])
def form():
    if request.method == 'GET':
        # Display a form for the user to input their name
        return '''<form method="POST">
                    <label for="username">Enter your name:</label>
                    <input type="text" name="username" id="username" placeholder="Enter your name">
                    <button type="submit">Submit</button>
                  </form>'''
    else:
        # Process the submitted form data
        try:
            username = request.form['username']
            return f"Hello, {username}!"
        except KeyError:
            # Handle the case where the 'username' key is not present in the form data
            return "Please enter your name in the form."


In [24]:
app.run(host='localhost', port=5006)

 * Serving Flask app 'First app'
 * Debug mode: off


 * Running on http://localhost:5006
Press CTRL+C to quit
127.0.0.1 - - [14/Dec/2024 15:10:45] "GET /form HTTP/1.1" 200 -
127.0.0.1 - - [14/Dec/2024 15:10:51] "POST /form HTTP/1.1" 200 -


This cell introduces templating with Jinja2, a powerful engine used by Flask for rendering dynamic HTML content. We import render_template from Flask. The @app.route('/') decorator is used again for the root path.

- We create a dictionary named data containing variables to be used in the template.
- The render_template function takes two arguments:
    - The template filename ('index.html')
    - A dictionary (data) containing variables to be passed to the template (context)

In [12]:
from flask import Flask, render_template

app = Flask("First app")

@app.route('/')
def index():
    # Create a dictionary to pass data to the template
    data = {'title': 'My Flask App', 'message': 'Welcome!'}
    return render_template('index.html', data=data)


In [13]:
app.run(host='localhost', port=5001)


 * Serving Flask app 'First app'
 * Debug mode: off


 * Running on http://localhost:5001
Press CTRL+C to quit
127.0.0.1 - - [09/Dec/2024 20:10:55] "GET / HTTP/1.1" 200 -


In [14]:
@app.route('/users')
def users():
    # Create a list of users
    users = ['Alice', 'Bob', 'Charlie']
    return render_template('users.html', users=users)


In [15]:
app.run(host='localhost', port=5001)


 * Serving Flask app 'First app'
 * Debug mode: off


 * Running on http://localhost:5001
Press CTRL+C to quit
127.0.0.1 - - [09/Dec/2024 20:11:06] "GET / HTTP/1.1" 200 -


# Assignment 

You were introduced to the Python data types, structures, frames, formats, such as JSON and CSV, and interface. You have also been introduced to the Python Flask package for building web pages. The purpose of this assignment is to reinforce and assess your learning of the mentioned Python features through practice.  

## Question 1: Read the CSV file into a Pandas data frame and drop records where company names are None values:
- Print the count of records (rows) in the resulting data frame
- Print the first 5 records of the data frame


In [5]:
import pandas as pd
df = pd.read_csv(r"C:\Users\annem\Desktop\Lesson 10\templates\crunchbase_odm_orgs.csv")
lengthNoDrops = len(df)
cleaned = df.dropna(subset=['name'])
lengthDropped = len(cleaned)

# Print the count of records in the resulting data frame
if lengthDropped == lengthNoDrops:
    print("No missing company names, no records dropped.")
    print(f'Count of records: {lengthDropped}')
else:
    print(f'Dropped {lengthNoDrops-lengthDropped} records.')
    print(f'Count of records: {lengthDropped}')

# Print the first 5 records of the data frame    
df.head()

No missing company names, no records dropped.
Count of records: 9999


Unnamed: 0,uuid,name,type,primary_role,cb_url,domain,homepage_url,logo_url,facebook_url,twitter_url,linkedin_url,combined_stock_symbols,city,region,country_code,short_description
0,e1393508-30ea-8a36-3f96-dd3226033abd,Wetpaint,organization,company,https://www.crunchbase.com/organization/wetpai...,wetpaint.com,http://www.wetpaint.com/,https://res.cloudinary.com/crunchbase-producti...,https://www.facebook.com/Wetpaint,https://twitter.com/wetpainttv,https://www.linkedin.com/company/wetpaint,,New York,New York,USA,Wetpaint offers an online social publishing pl...
1,bf4d7b0e-b34d-2fd8-d292-6049c4f7efc7,Zoho,organization,company,https://www.crunchbase.com/organization/zoho?u...,zoho.com,https://www.zoho.com/,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/zoho,http://twitter.com/zoho,http://www.linkedin.com/company/zoho-corporati...,,Pleasanton,California,USA,"Zoho offers a suite of business, collaboration..."
2,5f2b40b8-d1b3-d323-d81a-b7a8e89553d0,Digg,organization,company,https://www.crunchbase.com/organization/digg?u...,digg.com,http://www.digg.com,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/digg,http://twitter.com/digg,http://www.linkedin.com/company/digg,,New York,New York,USA,Digg Inc. operates a website that enables its ...
3,df662812-7f97-0b43-9d3e-12f64f504fbb,Facebook,organization,company,https://www.crunchbase.com/organization/facebo...,facebook.com,http://www.facebook.com,https://res.cloudinary.com/crunchbase-producti...,https://www.facebook.com/facebook/,https://twitter.com/facebook,http://www.linkedin.com/company/facebook,nasdaq:FB,Menlo Park,California,USA,Facebook is an online social networking servic...
4,b08efc27-da40-505a-6f9d-c9e14247bf36,Accel,organization,investor,https://www.crunchbase.com/organization/accel?...,accel.com,http://www.accel.com,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/accel,http://twitter.com/accel,https://www.linkedin.com/company/accel-vc/,,Palo Alto,California,USA,Accel is an early and growth-stage venture cap...


## Question 2: Create a data frame that contains only the records of USA-based companies whose name starts with "Ac":
- Print the count of records (rows) in the resulting data frame
- Print the first 5 records of the data frame


In [6]:
# Make the company names all lowercase to make filtering easier
df['name'] = df['name'].str.lower()

# Get the USA-based companies whose name starts with ac
df_usa_ac = df[(df['country_code'] == 'USA') & (df['name'].str.startswith('ac'))]
lengthFiltered = len(df_usa_ac)

#Print the count of records in the resulting data frame
print(f'Count of records: {lengthFiltered}')

# Print the first 5 records of the data frame
df_usa_ac.head()

Count of records: 35


Unnamed: 0,uuid,name,type,primary_role,cb_url,domain,homepage_url,logo_url,facebook_url,twitter_url,linkedin_url,combined_stock_symbols,city,region,country_code,short_description
4,b08efc27-da40-505a-6f9d-c9e14247bf36,accel,organization,investor,https://www.crunchbase.com/organization/accel?...,accel.com,http://www.accel.com,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/accel,http://twitter.com/accel,https://www.linkedin.com/company/accel-vc/,,Palo Alto,California,USA,Accel is an early and growth-stage venture cap...
186,ed54b2d5-f2f1-d4e9-9bbe-40961ff08d44,action engine,organization,company,https://www.crunchbase.com/organization/action...,actionengine.com,http://www.actionengine.com,https://res.cloudinary.com/crunchbase-producti...,,,https://www.linkedin.com/company/actionengine,,Foster City,California,USA,Action Engine is an agile software development...
278,64a76f34-4e68-2eb6-d943-9465f39155cd,activeworlds,organization,company,https://www.crunchbase.com/organization/active...,activeworlds.com,http://www.activeworlds.com,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/activeworlds3d,http://twitter.com/activeworlds3d,,,Las Vegas,Nevada,USA,Active Worlds is a 3D virtual reality platform...
1244,ef598b5b-d588-c21f-9b93-be78443f4e22,acquia,organization,company,https://www.crunchbase.com/organization/acquia...,acquia.com,http://acquia.com,https://res.cloudinary.com/crunchbase-producti...,http://www.facebook.com/acquia,http://twitter.com/Acquia,http://www.linkedin.com/company/167056,,Boston,Massachusetts,USA,Acquia specializes in providing cloud-based di...
1879,135f427d-7f25-9492-1ffe-801b35b6b988,academic capital exchange,organization,company,https://www.crunchbase.com/organization/academ...,academicapital.com,http://www.academicapital.com,https://res.cloudinary.com/crunchbase-producti...,,,,,Chicago,Illinois,USA,"Academic Capital Exchange (ACE), a peer-to-pee..."


## Question 3: Convert the data frame from the previous step into a list of JSON objects:
- Print the count of JSON objects
- Print the first 5 JSON objects
- Write all JSON objects into a text (JSON string) file
- Print the number of records in the resulting file


In [7]:
# Convert the data frame into a list of JSON objects
import json
jsonStr = df_usa_ac.to_json(orient='records') # Data Frame to JSON string
jsonList = json.loads(jsonStr) # JSON string to list of objects

# Print the count of JSON objects
print(f'Count of objects: {len(jsonList)}')

# Print the first 5 JSON objects
print(jsonList[0:5])

# Write all JSON objects into a text file. 
with open(r"C:\Users\annem\Desktop\Lesson 10\Lesson10textFile.txt", "w") as fi:
    text = json.dumps(jsonList)
    fi.write(text)
fi.close()
# Print the number of records in the resulting file.
print(f'The number of records in this file: {len(jsonList)}')

Count of objects: 35
[{'uuid': 'b08efc27-da40-505a-6f9d-c9e14247bf36', 'name': 'accel', 'type': 'organization', 'primary_role': 'investor', 'cb_url': 'https://www.crunchbase.com/organization/accel?utm_source=crunchbase&utm_medium=export&utm_campaign=odm_csv', 'domain': 'accel.com', 'homepage_url': 'http://www.accel.com', 'logo_url': 'https://res.cloudinary.com/crunchbase-production/image/upload/kxcwecxf439wsgluv7jv', 'facebook_url': 'http://www.facebook.com/accel', 'twitter_url': 'http://twitter.com/accel', 'linkedin_url': 'https://www.linkedin.com/company/accel-vc/', 'combined_stock_symbols': None, 'city': 'Palo Alto', 'region': 'California', 'country_code': 'USA', 'short_description': 'Accel is an early and growth-stage venture capital firm that powers a global community of entrepreneurs.'}, {'uuid': 'ed54b2d5-f2f1-d4e9-9bbe-40961ff08d44', 'name': 'action engine', 'type': 'organization', 'primary_role': 'company', 'cb_url': 'https://www.crunchbase.com/organization/actionengine?utm_so

## Question 4: Read the JSON objects from the created file back into a data frame, filter for companies based in New York (city):
- Print the records in the resulting data frame
- Write the output to a webpage using Flask

In [8]:
from flask import Flask, render_template

# Read the JSON objects from the created file back into a data frame
jsonDict = []
with open(r"C:\Users\annem\Desktop\Lesson 10\Lesson10textFile.txt", "r") as fi:
    content = fi.read()
    jsonDict = json.loads(content) # JSON string to list of dictionaries
fi.close()
dframe = pd.DataFrame.from_dict(jsonDict) # list of dictionaries to data frame

# Filter for companies based in NYC
df_nyc_ac = dframe[(dframe['city'] == 'New York')]
lengthNYC = len(df_nyc_ac)

# Print the records in the resulting data frame
display(df_nyc_ac)

# Write the output to a webpage using Flask
app = Flask("Question 4")

@app.route('/')
def index():
    return df_nyc_ac.to_html(header=True, render_links=True)

Unnamed: 0,uuid,name,type,primary_role,cb_url,domain,homepage_url,logo_url,facebook_url,twitter_url,linkedin_url,combined_stock_symbols,city,region,country_code,short_description
25,41d88d51-b45f-83c8-341a-957027e036f7,activecause,organization,company,https://www.crunchbase.com/organization/active...,activecause.com,http://activecause.com,https://res.cloudinary.com/crunchbase-producti...,,http://twitter.com/activecause,https://www.linkedin.com/in/hankejh,,New York,New York,USA,"ActiveCause brings together nonprofits, corpor..."
32,3b7479b1-0908-7fa0-01d2-3a3db9f5036f,acerno,organization,company,https://www.crunchbase.com/organization/acerno...,acerno.com,http://www.acerno.com,https://res.cloudinary.com/crunchbase-producti...,,,,,New York,New York,USA,Acerno is an e-commerce advertising company th...


In [9]:
app.run(host='localhost', port=5001)

 * Serving Flask app 'Question 4'
 * Debug mode: off


 * Running on http://localhost:5001
Press CTRL+C to quit
127.0.0.1 - - [14/Dec/2024 14:20:09] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [14/Dec/2024 14:20:10] "GET /favicon.ico HTTP/1.1" 404 -


## Question 5: Implement Flask API functions to Return a list of Crunchbase companies based in New York
- Take a user input through an API endpoint to return a list of JSON objects from Webhose file where the "title" fields contain the query string 

In [27]:
# Get the Webhose file (the provided csv) into a data frame for retreival
dfQ5 = pd.read_csv(r"C:\Users\annem\Desktop\Lesson 10\templates\crunchbase_odm_orgs.csv")

from flask import Flask, request
app = Flask("Question 5")

@app.route('/form', methods=['GET', 'POST'])
def form():
    if request.method == 'GET':
        # Display a form for the user to input the City name
        return '''<form method="POST">
                    <label for="city">Enter City:</label>
                    <input type="text" name="city" id="city" placeholder="Enter the City">
                    <button type="submit">Submit</button>
                  </form>'''
    else:
        # Process the submitted form data
        try:
            city = request.form['city']
            df_filtered = dfQ5[(dfQ5['city'] == city)]
            return jsonify(df_filtered)
        except KeyError:
            # Handle the case where the 'city' key is not present in the data
            return "Please enter the city in the form."

In [28]:
app.run(host='localhost', port=5002)

 * Serving Flask app 'Question 5'
 * Debug mode: off


 * Running on http://localhost:5002
Press CTRL+C to quit
127.0.0.1 - - [14/Dec/2024 15:12:05] "GET /form HTTP/1.1" 200 -
[2024-12-14 15:12:09,149] ERROR in app: Exception on /form [POST]
Traceback (most recent call last):
  File "C:\Users\annem\AppData\Roaming\Python\Python39\site-packages\flask\app.py", line 1511, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\annem\AppData\Roaming\Python\Python39\site-packages\flask\app.py", line 919, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\annem\AppData\Roaming\Python\Python39\site-packages\flask\app.py", line 917, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\annem\AppData\Roaming\Python\Python39\site-packages\flask\app.py", line 902, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "C:\Users\annem\AppData\Local\Temp\ipykernel_43704\2934559927.py", line 21, in form
    

## Question 6: Git Version Control

- Initialize a Git repository for your project directory.
- Commit your code (data.py, Flask application script) with a descriptive message (e.g., "Initial commit: Flask API with data and route").
- Push your code to a remote Git repository (e.g., GitHub). Submit the repository URL in the answer cell.


In [None]:
#Write git repo link

# Assignment End