# YOUR PROJECT TITLE

> **Note the following:** 
> 1. This is *not* meant to be an example of an actual **data analysis project**, just an example of how to structure such a project.
> 1. Remember the general advice on structuring and commenting your code.
> 1. The `dataprojectyf.py` file includes functions that can be used multiple times in this notebook.

Imports and set magics:

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import ipywidgets as widgets
import yfinance as yf

# autoreload modules when code is run
%load_ext autoreload
%autoreload 2

# user written modules
import dataprojectyf as dp


# Read and clean data

Import your data, either through an API or manually, and load it. 

## Explore each data set

To **explore the raw data**, you may provide **static** and **interactive plots** that show important developments.

**Interactive plot**:

In [19]:
# a. fetch the data for the House representatives
house_raw = dp.fetch_data(mode="house", print_df=True)

# b. fetch example data from yfinance
example_yf = yf.download("AAPL", start="2022-03-01", end="2022-03-31", interval="1d")

display(example_yf)



request successful


,date,ticker,amount,action,representative,party,description
0,2021-09-27,BP,"$1,001 - $15,000",purchase,Virginia Foxx,Republican,BP plc
1,2021-09-13,XOM,"$1,001 - $15,000",purchase,Virginia Foxx,Republican,Exxon Mobil Corporation
2,2021-09-10,ILPT,"$15,001 - $50,000",purchase,Virginia Foxx,Republican,Industrial Logistics Properties Trust - Common...
3,2021-09-28,PM,"$15,001 - $50,000",purchase,Virginia Foxx,Republican,Phillip Morris International Inc
4,2021-09-17,BLK,"$1,001 - $15,000",sale_partial,Alan S. Lowenthal,Democrat,BlackRock Inc
...,...,...,...,...,...,...,...
16943,2020-04-09,SWK,"$1,001 - $15,000",sale_partial,Ed Perlmutter,Democrat,"Stanley Black & Decker, Inc."
16944,2020-04-09,USB,"$1,001 - $15,000",sale_partial,Ed Perlmutter,Democrat,U.S. Bancorp
16945,2020-03-13,BMY,"$100,001 - $250,000",sale_full,Van Taylor,Republican,Bristol-Myers Squibb Company
16946,2020-03-13,LLY,"$500,001 - $1,000,000",sale_full,Van Taylor,Republican,Eli Lilly and Company


[*********************100%***********************]  1 of 1 completed


Date,Open,High,Low,Close,Adj Close,Volume
2022-03-01,164.699997,166.600006,161.970001,163.199997,162.217346,83474400
2022-03-02,164.389999,167.360001,162.949997,166.559998,165.557098,79724800
2022-03-03,168.470001,168.910004,165.550003,166.229996,165.22908,76678400
2022-03-04,164.490005,165.550003,162.100006,163.169998,162.187515,83737200
2022-03-07,163.360001,165.020004,159.039993,159.300003,158.340836,96418800
2022-03-08,158.820007,162.880005,155.800003,157.440002,156.49202,131148300
2022-03-09,161.479996,163.410004,159.410004,162.949997,161.968857,91454900
2022-03-10,160.199997,160.389999,155.979996,158.520004,157.565521,105342000
2022-03-11,158.929993,159.279999,154.5,154.729996,153.79834,96970100
2022-03-14,151.449997,154.119995,150.100006,150.619995,149.713074,108732100
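
The cell above only displays tables, so here is a minimal sketch of what an interactive plot of the price data could look like. It assumes a date-indexed table with one column per price field, like `example_yf`. Both `plot_column` and `interactive_plot` are hypothetical helpers written for this illustration and are not part of `dataprojectyf`.

```python
import matplotlib.pyplot as plt
import pandas as pd

def plot_column(df, column):
    """Draw one column of a date-indexed price table as a line chart."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(df.index, df[column])
    ax.set_xlabel("Date")
    ax.set_ylabel(column)
    ax.grid(True)
    return ax

def interactive_plot(df):
    """Attach a dropdown to plot_column (hypothetical helper; Jupyter only)."""
    import ipywidgets as widgets  # lazy import: plot_column works without it

    def update(column):
        plot_column(df, column)
        plt.show()

    widgets.interact(
        update,
        column=widgets.Dropdown(options=list(df.columns), description="Column"),
    )

# Three rows copied from the AAPL table above, as a stand-in for example_yf
sample = pd.DataFrame(
    {"Open": [164.699997, 164.389999, 168.470001],
     "Close": [163.199997, 166.559998, 166.229996]},
    index=pd.to_datetime(["2022-03-01", "2022-03-02", "2022-03-03"]),
)
ax = plot_column(sample, "Close")
```

Inside the notebook, calling `interactive_plot(example_yf)` should show a dropdown that redraws the chart for the selected column. The static chart comes from `plot_column` alone.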


# Merge data sets

In [None]:
# c. clean the data
house_clean = dp.clean_data(house_raw, print_df=True)

In [7]:
# d. add the average of each trade's amount range
house_clean = dp.average_amount(house_clean)

# e. keep only the trades of one representative
nancy = dp.select_rep(house_clean, "Nancy Pelosi", print_df=True)


,date,ticker,action,representative,party,description,min_amount,max_amount,avg_amount
0,2020-08-07,FB,sale_full,Nancy Pelosi,dem,"Facebook, Inc. - Class A",1000001.0,5000000.0,3000000.5
1,2021-03-10,RBLX,purchase,Nancy Pelosi,dem,Roblox Corporation - purchase 10K shares,500001.0,1000000.0,750000.5
2,2021-02-18,AB,purchase,Nancy Pelosi,dem,"AllianceBerstein Holding LP. Purchased 15,000 ...",500001.0,1000000.0,750000.5
3,2021-02-23,AB,purchase,Nancy Pelosi,dem,"AllianceBerstein Holding LP. Purchased 25,000 ...",500001.0,1000000.0,750000.5
4,2020-06-24,AXP,purchase,Nancy Pelosi,dem,American Express Company,100001.0,250000.0,175000.5
5,2020-06-18,AAPL,sale_full,Nancy Pelosi,dem,Apple Inc.,1000001.0,5000000.0,3000000.5
6,2020-06-18,NFLX,purchase,Nancy Pelosi,dem,"Netflix, Inc.",1000001.0,5000000.0,3000000.5
7,2020-06-12,PYPL,purchase,Nancy Pelosi,dem,"PayPal Holdings, Inc.",500001.0,1000000.0,750000.5
8,2020-06-12,PYPL,purchase,Nancy Pelosi,dem,"PayPal Holdings, Inc.",500001.0,1000000.0,750000.5
9,2020-06-24,PYPL,purchase,Nancy Pelosi,dem,"PayPal Holdings, Inc.",250001.0,500000.0,375000.5
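
The `min_amount`, `max_amount` and `avg_amount` columns above replace the raw `amount` string (for example `"$1,001 - $15,000"`). The implementations of `dp.clean_data` and `dp.average_amount` are not shown in this notebook, so the following is only a minimal sketch of how such a range could be parsed. `parse_amount_range` is a hypothetical helper, and it assumes every string contains exactly two dollar figures.

```python
import re

def parse_amount_range(text):
    """Parse a disclosure range such as "$1,001 - $15,000".

    Returns (min_amount, max_amount, avg_amount) as floats.
    Hypothetical helper; assumes exactly two dollar figures in the string.
    """
    # find every run of digits and thousands separators, e.g. "1,001"
    figures = re.findall(r"\d[\d,]*", text)
    if len(figures) != 2:
        raise ValueError(f"expected a two-sided range, got {text!r}")
    low, high = (float(f.replace(",", "")) for f in figures)
    return low, high, (low + high) / 2

print(parse_amount_range("$1,001 - $15,000"))  # (1001.0, 15000.0, 8000.5)
```

On a DataFrame, the same function could be applied to the `amount` column row by row, for instance with `Series.apply`. Open-ended or one-sided amounts would need their own handling, which this sketch deliberately rejects with an error.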


Explain what you see when moving elements of the interactive plot around. 

In [16]:
# download price data for the tickers traded by the selected representative
stocks_price = dp.get_stock_data(nancy)

[*********************100%***********************]  22 of 22 completed

2 Failed downloads:
- FB: No timezone found, symbol may be delisted
- WORK: No timezone found, symbol may be delisted
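
The two failed downloads are tickers that no longer trade under those symbols. One way to handle them, sketched here under assumptions, is to normalize the ticker list before requesting prices. The mapping below is an assumption for illustration: Meta Platforms switched its symbol from FB to META in 2022, while WORK (Slack) was delisted after its acquisition and has no replacement. `normalize_tickers` is a hypothetical helper, not part of `dataprojectyf`.

```python
# Assumed mapping from outdated tickers to current ones (extend as needed)
TICKER_RENAMES = {"FB": "META"}

# Assumed set of tickers with no successor symbol; these are dropped
DELISTED = {"WORK"}

def normalize_tickers(tickers):
    """Return unique, sorted tickers with renames applied and delisted ones removed."""
    current = {TICKER_RENAMES.get(t, t) for t in tickers if t not in DELISTED}
    return sorted(current)

print(normalize_tickers(["AAPL", "FB", "WORK", "AAPL"]))  # ['AAPL', 'META']
```

If the renamed symbol is used for the download, the trades table needs the same rename so that both data sets still share a common `ticker` key.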


In [17]:
display(stocks_price)

,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,...,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume
Date,AAPL,AB,AMZN,AXP,CRM,CRWD,DIS,FB,GOOG,GOOGL,...,MSFT,MU,NFLX,NVDA,PYPL,RBLX,TSLA,V,WBD,WORK
2020-02-20,78.513992,27.180937,107.654999,130.456650,193.360001,63.380001,140.369995,,75.907501,75.849503,...,36862400,20797100,4079400,81005200,7212900,,264523500,8531900,5705100,
2020-02-21,76.736816,26.504711,104.798500,128.851990,189.500000,60.930000,138.970001,,74.255501,74.172997,...,48572600,27038900,3930100,76818000,5894500,,214722000,9231500,4574600,
2020-02-24,73.091782,25.665266,100.464500,122.442818,185.940002,58.080002,133.009995,,71.079498,70.992996,...,68311100,33321100,6936400,85691600,10139300,,227883000,13316800,7157200,
2020-02-25,70.616013,25.144499,98.637001,115.479645,181.270004,57.740002,128.190002,,69.422501,69.316002,...,68073300,41894200,6481200,105549600,13636100,,259357500,18539000,8349800,
2020-02-26,71.736229,24.911320,98.979500,113.187256,178.869995,58.459999,123.360001,,69.658997,69.523499,...,56206100,28075600,8934100,74773200,10301800,,211282500,14214700,6230400,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-12-22,132.028412,34.223328,83.790001,144.273575,129.190002,103.970001,86.669998,,88.260002,87.760002,...,28651700,40959600,7856200,56504500,16488200,12842900.0,210090300,5690300,28566000,
2022-12-23,131.658981,33.682755,85.250000,145.971390,129.440002,101.989998,88.010002,,89.809998,89.230003,...,21207000,17425700,4251100,34932600,9990400,10774500.0,166989700,3246000,19035100,
2022-12-27,129.831772,33.496014,83.040001,145.345886,130.660004,100.660004,86.370003,,87.930000,87.389999,...,16688600,15159100,5778100,46490200,10323800,11084000.0,208643400,2904900,22207700,
2022-12-28,125.847855,32.562290,81.820000,142.982864,128.470001,99.959999,84.169998,,86.459999,86.019997,...,17457100,12518700,5964400,35106600,8897600,9948900.0,221070500,3139200,22506100,
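
One possible way to merge the two data sets, sketched under assumptions: reshape the wide price table (columns indexed by field and ticker) into long format, then left-join it onto the trades by `date` and `ticker`. The small frames below are stand-ins. The prices are copied from the table above, while the trade rows are illustrative placeholders and not real trades. The names `prices_wide`, `prices_long`, `trades` and `merged` are introduced only for this example.

```python
import pandas as pd

# Toy stand-in for stocks_price: columns are a (field, ticker) MultiIndex,
# the index holds trading dates. Values copied from the table above.
dates = pd.to_datetime(["2020-02-20", "2020-02-21"])
columns = pd.MultiIndex.from_product([["Adj Close"], ["AAPL", "AB"]])
prices_wide = pd.DataFrame(
    [[78.513992, 27.180937], [76.736816, 26.504711]],
    index=pd.Index(dates, name="Date"),
    columns=columns,
)

# Toy stand-in for the cleaned trades (illustrative rows, not real trades).
# FB has no price column here, mirroring the failed download.
trades = pd.DataFrame({
    "date": pd.to_datetime(["2020-02-20", "2020-02-21", "2020-02-20"]),
    "ticker": ["AAPL", "AB", "FB"],
    "action": ["purchase", "sale_full", "purchase"],
})

# wide -> long: one row per (date, ticker) pair
prices_long = (
    prices_wide["Adj Close"]
    .reset_index()
    .melt(id_vars="Date", var_name="ticker", value_name="adj_close")
    .rename(columns={"Date": "date"})
)

# left join keeps every trade, even if no price is available for it
merged = trades.merge(prices_long, on=["date", "ticker"], how="left")
print(merged)
```

A left join is used so that trades without price data stay visible as missing values instead of silently disappearing. An inner join would be the alternative if only fully matched rows are wanted.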


# Analysis

# Conclusion