# Let's Get Started

**Objectives**: Today we are going to get our development and access environments set up and introduce you to the various tools and resources we will be using in class. This will include:
  
* Setting up your environment with Anaconda, the free Python distribution for data science we will be using
* Reviewing how to use GitHub to install these Jupyter notebooks on the PC your team will be using to complete your assignments
* Starting our first data analysis with the classic "iris dataset"
* Reviewing sources of help for your work with Python
* Developing an understanding of a data science workflow

When you first start the class, you will most likely be viewing a static version of this page on GitHub. Once you follow the directions below, you will have Python and our required libraries running on your computer. You can then download the "live" version of this page, which will allow you to run code and complete your assignment for the week, as detailed below.
      

## Setting Up Your Environment



1. Download and install Anaconda for your operating system: https://www.continuum.io/downloads

2. During installation, make sure to select the "Add Anaconda to my PATH environment variable" option.

3. Open a command prompt or terminal (search online for how to do so if you are unsure).

4. Run the command `conda --version` in the command prompt or terminal and confirm that it prints a version number. If it does not, something went wrong during installation and you will need to repeat step 1.

5. Run the following command based on your operating system:
  1. Windows: `conda install --channel https://conda.anaconda.org/ProfessorEaston tweepy`
  2. Mac: `conda install --channel https://conda.anaconda.org/zed tweepy`
  3. Linux: `conda install --channel https://conda.anaconda.org/ActivisionGameScience tweepy`
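
If you prefer to script the check from step 4, the following is a minimal sketch rather than part of the official setup. The helper name `tool_version` is hypothetical, and it assumes the tool answers to a `--version` flag, as `conda` does in step 4.

```python
import subprocess


def tool_version(command):
    """Return the text printed by `command --version`, or None if it cannot be run."""
    try:
        # Some tools print their version on stderr, so capture both streams.
        output = subprocess.check_output([command, "--version"],
                                         stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # OSError: the command is not on PATH. CalledProcessError: it exited with an error.
        return None
    return output.decode("utf-8", "replace").strip()


version = tool_version("conda")
if version is None:
    print("conda was not found on your PATH; revisit steps 1 and 2.")
else:
    print("Found {}".format(version))
```

A `None` result usually means the PATH option from step 2 was not selected, so the terminal cannot locate `conda`.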
  

Finally, check that Python installed correctly by entering the following at the command line:

````
ipython qtconsole
````

If you installed Python correctly, you should see a console launch that looks like this:

<img src="https://raw.githubusercontent.com/azbones/big_data/master/images/week1-qtconsole.png">
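
Once the console is open, it can help to confirm which Python interpreter it is actually running, since some computers have more than one installed. The snippet below is a small sketch using only the standard library; the function name `describe_interpreter` is made up for this example.

```python
import sys


def describe_interpreter():
    """Return a one-line summary of the Python interpreter running this code."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return "Python {} at {}".format(version, sys.executable)


print(describe_interpreter())
```

If the path that is printed does not point into your Anaconda installation, the console is probably using a different Python than the one you just installed.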

Check that everything else installed correctly by copying the code from the code block below, pasting it into your IPython/Jupyter console, and pressing Return.

If all your packages were installed correctly, you should see **"Good to go on (your computer's name)!"**

## Code Segment to Test Your Local Development Environment

In [None]:
import pkg_resources
from distutils.version import StrictVersion
import socket

def check_version_number(module, minimum):
    """Returns True if the installed module meets the class's minimum version"""
    try:
        module_version = pkg_resources.get_distribution(module).version
    except Exception:
        # The module is not installed (or its metadata cannot be read)
        print('You are missing the {} module.'.format(module))
        return False

    if StrictVersion(module_version) < StrictVersion(minimum):
        print('Your version of {0} is too old at {1}! Need at least version'
              ' {2}...'.format(module, module_version, minimum))
        return False

    return True

# Logical checks of each version using check_version_number function.
# The function call comes before "and success" so every module is checked
# and reported, even after an earlier check has failed.
success = check_version_number('pandas', '0.14.0')
success = check_version_number('boto', '2.29.1') and success
success = check_version_number('tweepy', '3.4.0') and success

# Provide users with results of checking for correct versions in stdout
if success:
    print('Good to go on {}!'.format(socket.gethostname()))
else:
    print('Validation failed. You have missing or outdated modules. Please'
          ' go through the install procedures again.')
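
The check above compares version numbers piece by piece rather than as plain text. The short sketch below illustrates why that matters. The helper `version_tuple` is hypothetical, is not used by the class code, and assumes versions made up only of numbers separated by dots.

```python
def version_tuple(version):
    """Split a dotted version such as '0.14.0' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


# Plain string comparison goes character by character, so "9" beats "1"
# and '0.9.0' wrongly looks newer than '0.14.0'.
print("0.9.0" < "0.14.0")                                # False

# Comparing the numeric parts gives the intended answer.
print(version_tuple("0.9.0") < version_tuple("0.14.0"))  # True
```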

## Jupyter Notebooks

**Getting the Notebooks**

We are using Jupyter notebooks (http://jupyter.org/) to facilitate the technical portion of our class. Jupyter notebooks give us a web-based, interactive Python environment where we can run code, present visualizations, and provide explanatory text. We developed these notebooks using the version control system Git (https://git-scm.com/) and store them centrally at GitHub (http://github.com). The notebooks are publicly **viewable** in our repository at https://github.com/azbones/big_data. In order to run our sample code, build your own code for the assignments, answer questions in the notebook, and ultimately turn in your assignment, you will need to download the notebooks to your computer and install them. There are two ways to do this:

* **Easy Way**: Click on the "download zip" button, download the zip file, and then unzip it in the directory where you will be doing your work.

* **More Elite Way**: Install Git on your computer and clone the GitHub repository to the directory where you will be doing your work. A review of Git is beyond the scope of this class, but you can learn more about it and GitHub here: https://help.github.com/categories/bootcamp/. By the way, as we consider our notebooks open source, you can contribute to making them better by starting a pull request, which we will evaluate and perhaps include in the source code going forward. Learn about pull requests here: https://help.github.com/articles/creating-a-pull-request/
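
For the Git route, the sketch below shows the clone command that would be run. It is illustrative only. It assumes the clone URL is the repository address above with `.git` appended, and the function names `clone_command` and `clone_repository` are invented for this example. Nothing is downloaded unless you call `clone_repository` yourself.

```python
import subprocess

# Assumption: the clone URL is the repository address plus ".git".
REPO_URL = "https://github.com/azbones/big_data.git"


def clone_command(repo_url, target_dir):
    """Build the argument list for `git clone <repo_url> <target_dir>`."""
    return ["git", "clone", repo_url, target_dir]


def clone_repository(repo_url, target_dir):
    """Run the clone; return True on success, False if Git is missing or the clone fails."""
    try:
        subprocess.check_call(clone_command(repo_url, target_dir))
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


# Show the equivalent command you could type in a terminal instead.
print(" ".join(clone_command(REPO_URL, "big_data")))
```

Typing the printed command directly in your terminal does the same thing as calling `clone_repository(REPO_URL, "big_data")`.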

**Running the Notebooks**

Once you have the repository from GitHub on your local computer, open your console and navigate to that directory, which should be named "big_data". Next, enter the following command (note: the project recently renamed IPython Notebooks to Jupyter Notebooks, given that they now work with languages other than Python):

````
ipython notebook
````

If everything works as intended, this command should start a local web server and then direct your default web browser to the root Jupyter directory. You should now be able to navigate to "week_1" and launch the Jupyter notebook called "Week 1- Getting Ready- Environments, Python, Jupyter, pandas, and Credentials".
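
If the "week_1" folder does not appear in your browser, the notebook server was probably started from the wrong directory. The sketch below is one way to check before launching. The helper `find_week_folder` is hypothetical and simply looks for a folder with the expected name.

```python
import os


def find_week_folder(repo_dir, week_name="week_1"):
    """Return the path to the week folder inside repo_dir, or None if it is missing."""
    candidate = os.path.join(repo_dir, week_name)
    return candidate if os.path.isdir(candidate) else None


folder = find_week_folder(os.getcwd())
if folder is None:
    print("No week_1 folder here; change into the big_data directory first.")
else:
    print("Found notebooks in {}".format(folder))
```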

While the web page you launch from your local machine will look identical to the one on GitHub, it is fundamentally different. Instead of a static web page, you now have an interactive page with all the power of Python!

To test your notebook, select the code block above so that there is a black outline around it and then select "Cell" and "Run" from the Jupyter menu. You should see the output of this code right below the code block.

Now try the same with the code block below:
      

In [None]:
import this