# Analyzing Investment Decisions Using Machine Learning

In this project we will use machine learning to process financial data and use the results to make investment decisions. We will focus mainly on whether to buy, sell, or hold a particular stock given the information we have.

We will also learn how to pull data, process it, and then apply some of the most important machine learning algorithms to analyze it. The data we will be using covers the tickers in the S&P 500.

Since this is going to be quite a long project, we will not make all the necessary imports at once; instead, we will import libraries along the way as we need them.

First imports:

In [1]:
import bs4 as bs # Beautiful Soup
import pickle 
import requests

We will first pull the ticker symbols of all companies in the S&P 500 using BeautifulSoup4, which parses a web page's HTML source into a BeautifulSoup object that can be navigated and searched like a typical Python object.
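To make the idea of a BeautifulSoup object concrete, here is a minimal sketch that parses a small, made-up HTML string instead of a real web page. The markup, the variable names, and the choice of Python's built-in 'html.parser' (so the snippet runs without lxml installed) are assumptions for illustration only.

```python
import bs4 as bs

# A tiny, made-up HTML document standing in for a real web page
html = """
<html><body>
  <h1 id="title">Sample page</h1>
  <p class="note">First paragraph</p>
  <p class="note">Second paragraph</p>
</body></html>
"""

# Parse the markup; 'html.parser' ships with Python (assumption: used here
# only so the sketch has no extra dependency beyond bs4)
soup = bs.BeautifulSoup(html, 'html.parser')

# Tags can be reached like attributes, and tag attributes like dictionary keys
heading_text = soup.h1.text
heading_id = soup.h1['id']

# find_all returns every matching tag, which we can loop over like a list
paragraphs = [p.text for p in soup.find_all('p')]

print(heading_text, heading_id, paragraphs)
```

Once the page is parsed, the rest of the work is ordinary Python: attribute access, indexing, and loops.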

We will fetch the Wikipedia page that lists these ticker symbols using the requests.get() method. Finally, we will pick out the exact table that contains the ticker symbols we want to extract.

In [2]:
# Request information from Wikipedia by using the web link
resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')

# Parse the page's HTML with the lxml parser (a toolkit for processing HTML and XML in Python)
soup = bs.BeautifulSoup(resp.text, 'lxml')

# Find and extract the particular table that we want
table = soup.find('table', {'class': 'wikitable sortable'})

The last line of the code above extracts the first table whose class name is "wikitable sortable", as shown below (taken from the Wikipedia page source code):

In [None]:
from IPython.display import Image
Image("wikipedia.png")
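The class-based lookup can also be tried offline. The sketch below runs the same soup.find call against a small, made-up table rather than the live Wikipedia page, then reads the first cell of each data row. The sample tickers, the assumption that the symbol sits in the first column, and the use of 'html.parser' in place of lxml are all illustrative, not taken from the actual page.

```python
import bs4 as bs

# Made-up markup imitating the shape of a wiki table (not real S&P 500 data)
sample_html = """
<table class="wikitable sortable">
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>AAA</td><td>Example Corp A</td></tr>
  <tr><td>BBB</td><td>Example Corp B</td></tr>
</table>
<table class="other"><tr><td>ignored</td></tr></table>
"""

soup = bs.BeautifulSoup(sample_html, 'html.parser')

# Same lookup as in the notebook: the first table with this class string
table = soup.find('table', {'class': 'wikitable sortable'})

tickers = []
# Skip the header row, then take the text of the first cell in each row
# (assumption: the ticker symbol is in the first column)
for row in table.find_all('tr')[1:]:
    first_cell = row.find('td')
    tickers.append(first_cell.text.strip())

print(tickers)
```

If the real page's table ever carries a different class string, soup.find returns None, so it is worth checking the result before looping over its rows.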