## Data Science

_What is a Data Scientist?_: A data scientist has been jokingly defined as "a person who is better at statistics than any software engineer and better at software engineering than any statistician." Perhaps more usefully, a data scientist is someone with the programming skills to acquire and manage large data sets, the statistical skills to extract value from those sets with the scientific method, and the business skills to effectively present that value to an audience.

#### What will this fundamentals course include?
This introductory course covers basic programming fundamentals and statistics topics. You'll get an introduction to programming in Python, an overview of the math and stats behind the tools you'll use as a data scientist, and you'll begin planning for your career in data science.

#### Capstone project
This fundamentals course culminates in a real-world final capstone project that will give you a view into the professional life of a data scientist.

## Lesson 1: Introduction to Python
#### [Python Cheatsheet:](https://tf-assets-prod.s3.amazonaws.com/tf-curric/data-science/PYTHON-Basics.pdf)

Python continues to evolve through community-driven PEPs (Python Enhancement Proposals). PEP 8 is the agreed-upon style guide for Python code.

We will be using Jupyter notebooks as our IDE for this course. The following code block serves as our "hello world" example:

In [1]:
# Edit the line below to add your first name between the ""
# characters.
my_name = "John"

greeting = "Hello, world, I'm "

def say_hello(name):
    if name == "":
        print("You can't introduce yourself if you don't add your name!")
    else:
        print(greeting + name)
    
# Click the "play" button to execute this code.
say_hello(my_name)

Hello, world, I'm John


#### Using variables in Python
__Concept 1: storing something long in a variable.__ This is really using a variable as a constant. Python has no constant construct, so nothing stops the value from being reassigned; you simply don't change it. What Python does have is a style convention for constants: write the name in ALL_CAPS with words separated by underscores.

In [2]:
# This URL is long. Let's write it just once and store it in a
# variable to use later.
base_url = "thinkful-students.slack.com/messages/"  # works, but does not signal a constant
BASE_URL = "thinkful-students.slack.com/messages/"  # preferred style for a constant

# We're going to use newlines ("\n") to format things nicely.
empty_line = "\n"  #No good justification for this.

print("The Thinkful Slack community is a great place to get help or share your work.\n")
#print(empty_line)

print("#general-discussion is where everyone can chat about whatever they like")
print("You can get there by going to:")

# Now we're going to start using our `BASE_URL` constant multiple
# times.
print(BASE_URL + "general-discussion \n")
#print(empty_line)

print("Data science specific conversation is good for #data-science, at:")
print(BASE_URL + 'data-science \n')
#print(empty_line)

print("There's also a #careers channel dedicated to job-hunting:")
print(BASE_URL + "careers \n")
#print(empty_line)

print("Be sure to find your mentor on Slack and introduce yourself to the community.")


The Thinkful Slack community is a great place to get help or share your work.

#general-discussion is where everyone can chat about whatever they like
You can get there by going to:
thinkful-students.slack.com/messages/general-discussion 

Data science specific conversation is good for #data-science, at:
thinkful-students.slack.com/messages/data-science 

There's also a #careers channel dedicated to job-hunting:
thinkful-students.slack.com/messages/careers 

Be sure to find your mentor on Slack and introduce yourself to the community.


__Concept 2:__ Keeping track of a changing _State_ in a program, whether it be an iterator, an index, an accumulator, etc.

In [3]:
# Ask the user for input and store the result in a variable.
# Note that int() converts the typed text to an integer, so it raises
# a ValueError if the user types anything that is not a whole number.
bottle_count = int(input("How many bottles of beer are on the wall?"))

# The format method is a common way to insert a variable's value into 
# a printed string.
while bottle_count > 0:
    print("There are {} bottles of beer on the wall.".format(bottle_count))

    if input("Would you like to take any down?") == "yes":
        removed_bottles = int(input("Ok, how many?"))
        bottle_count = bottle_count - removed_bottles
        print("There are now {} bottles on the wall".format(bottle_count))
    else:
        print("Ok, there are still {} bottles on the wall.".format(bottle_count))


How many bottles of beer are on the wall?99
There are 99 bottles of beer on the wall.
Would you like to take any down?yes
Ok, how many?99
There are now 0 bottles on the wall
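The cell above stops with a `ValueError` if the reply is not a whole number. One way to guard against that is sketched below. The helper name `parse_count` is hypothetical and not part of the course code, and it takes a plain string instead of calling `input()` so it can be run without a prompt.

```python
def parse_count(text):
    """Return text as a non-negative int, or None if it is not one.

    Hypothetical helper for illustration; not part of the course code.
    """
    try:
        value = int(text)
    except ValueError:
        # int() could not interpret the text as a whole number.
        return None
    return value if value >= 0 else None


print(parse_count("99"))      # 99
print(parse_count("ninety"))  # None
print(parse_count("-3"))      # None
```

In the bottle program, a `None` result could be used as the signal to ask the question again.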


As in most languages, "=" and "==" are very different things in Python: "=" assigns a value to a variable, while "==" tests whether two values are equal.
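A minimal sketch of the difference, reusing the `bottle_count` name from the cell above (the `is_empty` variable is introduced here only for illustration):

```python
# `=` binds a name to a value.
bottle_count = 99

# `==` compares two values and produces a bool, which `=` then stores.
is_empty = bottle_count == 0
print(is_empty)            # False

bottle_count = bottle_count - 99
print(bottle_count == 0)   # True
```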

### Naming Rules and Conventions for Variables
1. Variable names must start with a letter or an underscore.
2. The remainder of your variable name may include letters, numbers, and underscores, and
3. The variable name cannot be a reserved word.
    <br><br>
4. _Style_: Constants are to be in all caps.
5. _Style_: Use StudlyCaps for class names. 
6. _Style_: Variable, method, and function names should always be snake_case.
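The first three rules can be checked in code. The sketch below uses `str.isidentifier()` for rules 1 and 2 and the standard-library `keyword` module for rule 3; the candidate names are made up for illustration.

```python
import keyword

# Made-up names to test against the rules above.
candidates = ["bottle_count", "_private", "2nd_place", "class", "BASE_URL"]

results = {}
for name in candidates:
    # isidentifier() covers the allowed characters; iskeyword() covers reserved words.
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, results[name])
```

Here `2nd_place` fails because it starts with a digit, and `class` fails because it is a reserved word. The style rules (4 to 6) are conventions only, so Python does not enforce them.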

### Playing around with Data Types
#### type str (strings)

In [4]:
# Start by assigning a string to our variable.
food = "ham"

# Let's see what the built-in `type()` function does.
# I'm expecting to see "string" as the variable type.
type_of_food = type(food)
print(type_of_food)

# But what about now?
type_of_type = type(type_of_food)
print(type_of_type)


<class 'str'>
<class 'type'>


In retrospect, both of these make sense. The class of a string value is `str`, and `str` is itself an object whose class is `type`.

In [5]:
# We can access the characters of a string by index with bracket
# notation, starting at index zero.
first_letter = food[0]
second_letter = food[1]
print("The first letter is " + first_letter)
print(type(first_letter))
print("The second letter is " + second_letter)
print(type(second_letter))


The first letter is h
<class 'str'>
The second letter is a
<class 'str'>


__Note:__ Unlike some other languages, Python has no separate char data type. A single character is simply a string of length 1.
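A short sketch of what that means in practice, reusing the `food` variable from above. The `ord()` and `chr()` built-ins are not covered in the lesson; they are shown here only as the usual way to move between a one-character string and its numeric code point.

```python
food = "ham"
first_letter = food[0]

print(type(first_letter))               # <class 'str'>
print(len(first_letter))                # 1
# Indexing a one-character string gives the same one-character string back.
print(first_letter[0] == first_letter)  # True

# ord() and chr() convert between a character and its code point.
print(ord("h"))   # 104
print(chr(104))   # h
```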

#### type int and type float (and complex)

In [6]:
# begin by loading the numpy package
import numpy as np

In [7]:
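# Define the constant PI
# PI = 2 * math.asin(1)
# No real need; it is already defined in the standard-library math module.
# (np.math was only an alias for that module, and newer NumPy releases
# no longer provide it, so import math directly.)
import math
print(math.pi)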

3.141592653589793


Python seems to understand complex numbers just fine.

Of note, exponentiation is written `**`. The `^` symbol used for powers in some other tools is the bitwise XOR operator in Python.

In [8]:
print((-1)**.5)

(6.123233995736766e-17+1j)


In [9]:
print((3+4j)*(3-4j))

(25+0j)


In [10]:
type(25+0j)

complex

In [11]:
type(2.3+5.7j)

complex
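To make the `**` versus `^` distinction concrete, here is a small sketch. It also uses `cmath.sqrt` from the standard library, which the lesson does not cover, as one way to get the square root of -1 without the tiny rounding error seen in the `(-1)**.5` result above.

```python
import cmath

print(2 ** 3)   # 8, exponentiation
print(2 ^ 3)    # 1, bitwise XOR (binary 10 XOR 11 = 01)

# The power operator returns a complex number with a tiny real part
# caused by floating-point rounding.
approx_root = (-1) ** 0.5
print(approx_root)

# cmath.sqrt returns the imaginary unit directly.
exact_root = cmath.sqrt(-1)
print(exact_root)   # 1j
```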

#### type Boolean 
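Values that evaluate as false in a Boolean context include None, False, zero of any numeric type (0, 0.0, 0j), and empty strings and collections such as "", [], (), {} and set(). Most other values evaluate as true.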

In [12]:
username = None
if (username):
    print('Yippee')
else:
    print('bummer')

username = "Jack"
if (username):
    print(username)
else:
    print('bummer')
    
username = 0
if (username):
    print(username)
else:
    print('bummer')
    
username = "Jack"
if (username=="Jill"):
    print(username)
else:
    print('bummer')

bummer
Jack
bummer
bummer
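The same truthiness test can be applied to a wider set of values with `bool()`. The two lists below are illustrative examples chosen for this sketch, not an exhaustive catalogue.

```python
# Each of these evaluates as False in an `if` test.
falsy_values = [None, False, 0, 0.0, 0j, "", [], (), {}, set()]

# These look "empty-ish" but are truthy: non-empty strings and containers,
# and any non-zero number.
truthy_values = ["0", " ", [0], {"a": None}, -1]

print([bool(v) for v in falsy_values])
print([bool(v) for v in truthy_values])
```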


#### Casting into other data types

In [13]:
print(str(42))
print(float("42.5") + 7)
print(bool(""))
print(int(bool("")))
print(int(bool("Yippee")))

42
49.5
False
0
1
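Casts can also fail or give surprising results. The sketch below shows three cases worth knowing; the variable names `converted` and `cast_failed` are introduced only for this example.

```python
# Chain casts when the text holds a decimal: str -> float -> int truncates.
converted = int(float("42.5"))
print(converted)        # 42

# int() refuses text that is not a whole number.
try:
    int("42.5")
    cast_failed = False
except ValueError as error:
    cast_failed = True
    print("Could not convert:", error)

# Any non-empty string is truthy, even the text "False".
print(bool("False"))    # True
```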


### Collections of Data
#### Lists 
Lists are ordered collections of data. To create a list simply wrap other data, separated by commas, in square brackets [ and ]:

In [14]:
inventory = ["beans", "coin", "tome"]
tome_dimensions = [8.5, 11, 2]

You can put any value into a list that you like, even another list. Just like the characters in a string, the elements in a list are ordered and can be accessed by index with bracket notation.

In [15]:
# Start by assigning some lists to variables.
inventory = ["beans", "coin", "tome"]
tome_dimensions = [8.5, 11, 2]

# Lists in lists...
random_stuff = [True, 3.14, ["pie", "pizza", "automobile"], inventory]
battleship_board = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]

# Just like with strings you can access list elements by index
# with bracket notation. We start counting at zero:
print(inventory[0])
print(random_stuff[3])

beans
['beans', 'coin', 'tome']


In [16]:
# With nested lists you can continue digging down with additional
# indexes:
print(random_stuff[2][0])

# Lists have a particular length.
inventory_size = len(inventory)
print("You have {} items in your inventory".format(inventory_size))

# Lists are easy to modify. We'll cover this in more detail later.
inventory.append("magic sword")
print(inventory)

# Can you think of a way to print 3.14 solely by referencing the
# variables above?
print(random_stuff[1])

pie
You have 3 items in your inventory
['beans', 'coin', 'tome', 'magic sword']
3.14
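The `battleship_board` list defined earlier is a list of rows, so two indexes pick out one cell. The sketch below assumes that a 1 marks a ship and a 0 marks open water, which the original cell does not state.

```python
battleship_board = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]

# The first index picks the row, the second picks the column.
print(battleship_board[1][2])   # 1

# Assigning through an index changes the list in place.
battleship_board[1][2] = 0
print(battleship_board[1])      # [1, 0, 0]

# Count the cells still holding a 1 (assumed here to mean "ship").
ships_left = sum(sum(row) for row in battleship_board)
print(ships_left)               # 4
```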


#### Dictionaries
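Dictionaries are similar to lists, except that each value is looked up by a key rather than by an index number. They are not sequences, so there is no positional indexing, although since Python 3.7 they do preserve the order in which keys were inserted.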

In [17]:
adventurer = {"name": "grae", "profession": "magician"}

In [18]:
print(adventurer['profession'])

magician


In [19]:
mixed_up = {"ages": [6,7,10,12], "IQ": [103, 120, 96, 42]}

In [20]:
print(mixed_up["ages"][2])

10


In [24]:
hero = {
    "name": None,
    "species": "Human",
    "strength": 4,
    "magic": 5,
    "profession": None,
}

# Let's check the hero's name again. Just like lists, we use
# bracket notation.
if not hero["name"]:
    # We modify dictionary values just like we access them.
    hero["name"] = input("What is your name?")
    print("Fantastic, thanks {}".format(hero["name"]))
    
# Nice: None and "" are falsy while any non-empty string is truthy,
# so now that a name is stored the same check takes the else branch.
if not hero["name"]:
    # We modify dictionary values just like we access them.
    hero["name"] = input("What is your name?")
    print("Fantastic, thanks {}".format(hero["name"]))
else:
    print("Cool. {} you are my favorite hero.".format(hero["name"]))

What is your name?JOhn
Fantastic, thanks JOhn
Cool. JOhn you are my favorite hero.
