# Jupyter Notebook Cells

Jupyter notebooks allow you to combine text and code. The text is written in a formatting language called Markdown. In fact, this very cell is written in **Markdown**. We use it to narrate the workshop and provide context. (*Imagine reading this notebook with no Markdown.*) Markdown is also used to document code in Python notebooks more generally.

A notebook has two main types of cells:

- Markdown
- Code

In [None]:
a = 2 + 6
print(a)

8


Control+Enter (Command+Enter on a Mac) executes the current cell without moving to the next cell.

You can run the same cell over and over again.

## Clearing Jupyter

Jupyter remembers every line of code it has executed, even if that line is no longer displayed in the notebook. This means that deleting a line of code does not delete its effects from the notebook's memory if it has already been run. To clear everything from Jupyter, use Kernel -> Restart in the menu. The kernel is the program that actually runs the code, so if you restart the kernel, it's as if you just opened the notebook for the first time. All of the variables are reset, and nothing that was run before is remembered.



## Some Jupyter Tips

### Tab Completion

Jupyter notebooks also allow for tab completion, just like many command line interpreters and text editors. If you begin typing the name of something (e.g., a variable, a file in the current directory, etc.) that already exists, you can simply hit Tab and Jupyter will autocomplete it for you. If there is more than one possibility, it will show them to you and you can choose from there. For example:

In [None]:
test_me = 1
test_me_2 = 2

Now try typing te and see what happens when you hit TAB.

## Indenting

Consistent indentation is essential in Python. Python pays close attention to whitespace at the start of a line and uses it to understand how you're structuring your code. So you should only add spaces or indents in places where Python expects them; otherwise, you'll run into an error.

To move multiple lines of code at once, you can select them and then hit Control + ] to indent them (move to the right), or Control + [ to dedent them to the left.

For Macs, use Command in place of Control.

Run the cell below and read the error message. What is the error type? How can we fix it?

In [None]:
move_me = 1
  move_me_too = 'abc'
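
As a rough sketch of what is going on, the same two lines can be compiled from a string so that the error is caught and inspected instead of stopping the cell. The names `bad_source` and `error_name` are introduced here for illustration only and are not part of the workshop code.

```python
# The two lines from the cell above, stored as text (hypothetical name: bad_source)
bad_source = "move_me = 1\n  move_me_too = 'abc'\n"

error_name = None
try:
    # compile() parses the text the same way Python parses a cell
    compile(bad_source, "<example>", "exec")
except IndentationError as err:
    error_name = type(err).__name__
    print(error_name, "-", err.msg)

# One possible fix: start both lines at the same indentation level
move_me = 1
move_me_too = 'abc'
```

Removing the two leading spaces is enough here because neither line belongs to a block such as a function or a loop.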

## Exiting Jupyter

When you close your Jupyter notebook window, all of your values will be lost. But you can save your code for a later time.

First, go to File -> Close and Shutdown Notebook to shut down the notebook you are using and close its window. Once all notebooks are shut down, you can shut down the entire Jupyter server by closing Anaconda Navigator. You may get a warning dialog box alerting you that Jupyter Notebook is still running. Just click Quit to shut everything down.

# Variables

Variables are placeholders for useful values that we want to refer to again later in the code.
In Python, the = symbol assigns the value on the right to the name on the left.
The variable is created when a value is assigned to it. When you call the variable, it will refer to whatever value it currently holds.
Here's Python code that assigns a year to a variable year and a month in quotation marks to a variable month.

In [None]:
year = 2018
month = 'July'

# We can print variables with print()
print(year)
print(month)

2018
July
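
To illustrate that a variable refers to whatever value it currently holds, here is a minimal sketch that reassigns `year`. It reuses the names from the cell above; nothing else is assumed.

```python
year = 2018
print(year)      # the value assigned first

# Assigning again replaces the old value; the right-hand side is evaluated first
year = year + 1
print(year)      # the updated value
```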


In [None]:
print("Year:", year, ". Month:", month, sep='')

Year:2018. Month:July
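
An f-string is another way to build the same line of text. The sketch below is an alternative, not a replacement for the print() call above; the name `message` is introduced here for illustration.

```python
year = 2018
month = 'July'

# Expressions inside {} are replaced by their current values
message = f"Year:{year}. Month:{month}"
print(message)
```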


# What is a Data Type?

Every value in a program has a specific type. Types tell Python how to interact with a variable. For example, you can use the division operation with numbers, but not text.
Sometimes types are obvious, but sometimes they can surprise us.
We use the type() function to find out the type of a value or variable. A function call is written as the function's name followed by parentheses, which contain any inputs to the function.

In [None]:
pi = 3.14159
print(type(pi))

fitness = 'average'
print(type(fitness))


pi2 = '3.14159'
print(type(pi2))
type(pi2)

<class 'float'>
<class 'str'>
<class 'str'>


str
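
The difference between the float and the string version of pi matters as soon as we try to compute with them. The following sketch, which assumes only the variable `pi2` from the cell above, shows division failing on text and working after a conversion with float(). The names `pi_number` and `half` are illustrative.

```python
pi2 = '3.14159'   # text, not a number

try:
    pi2 / 2       # division is not defined for strings
except TypeError as err:
    print("TypeError:", err)

# Convert the text to a float before doing arithmetic
pi_number = float(pi2)
half = pi_number / 2
print(type(pi_number), half)
```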

# Functions

Functions are a core part of programming. They let us run complex operations over and over without needing to rewrite the code each time. Arguments, or values passed to a function, allow us to use functions in more general ways.

- Function blocks begin with the keyword **def** followed by the function name and parentheses ( ( ) ).

- Any input parameters or arguments should be placed within these parentheses. You can also define parameters inside these parentheses.

- The first statement of a function can be an optional statement - the documentation string of the function or docstring.

- The code block within every function starts with a colon (:) and is indented.

- The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.


In [None]:
def functionname( parameters ):
   "function_docstring"
   function_suite
   return [expression]

In [None]:
def power(a, b):
    """Returns a raised to the power b."""
    return a**b

print(power.__doc__)

Returns a raised to the power b.
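
To make the point about return concrete, here is a small sketch of a function that uses a bare return statement. The function `greet` and the name `result` are hypothetical and only serve this example.

```python
def greet(name):
    """Print a greeting and return nothing explicitly."""
    print("Hello,", name)
    return  # a bare return hands back None

result = greet("Ada")
print(result)  # shows that the caller received None
```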


In [None]:
def foo(a):
    print("hello from within foo")
    b = 1 + a
    return b

foo(5)


def bar(c):
    return 10 * foo(c)

bar(7)

hello from within foo
hello from within foo


80

In [None]:
def my_function():
  print("Hello from a function")
my_function()

Hello from a function


In [None]:
def my_func():
    x = 10
    print("Value inside function:",x)

x = 20
my_func()
print("Value outside function:",x)

Value inside function: 10
Value outside function: 20
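
The x inside the function is a separate, local variable, which is why the outer x keeps its value. If the inner value is needed outside, one common approach is to return it. This is a sketch under that assumption; the name `returned` is illustrative.

```python
def my_func():
    x = 10          # local to the function
    return x        # hand the local value back to the caller

x = 20
returned = my_func()

print("Value outside function:", x)         # the outer x is unchanged
print("Value returned by function:", returned)
```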


# Lists
Lists are a collection of ordered items. Lists have a length, and the items inside can be indexed, or accessed based on their positions.

They're most useful when storing a collection of values when order is important. One nice thing about lists is that they can contain different types of data. For example, the entries of a list can be integers, floats, strings, and even other lists!

We specify a list with square brackets: [] and commas separating each entry in the list.

Lists have their own methods which perform operations specific to lists. The most common method is the append() method, which adds an item to the end of a list.

In [None]:
country_list = ["Afghanistan", "Canada", "Thailand", "Denmark", "Japan"]
type(country_list)

list

In [None]:
country_list.append('USA')
print(country_list)

['Afghanistan', 'Canada', 'Thailand', 'Denmark', 'Japan', 'USA']
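
The ideas of length, indexing, and mixed types described above can be sketched as follows. The variable names other than `country_list` are introduced here for illustration.

```python
country_list = ["Afghanistan", "Canada", "Thailand", "Denmark", "Japan"]

n = len(country_list)      # number of items in the list
first = country_list[0]    # positions start counting at 0
last = country_list[-1]    # negative positions count from the end

# A list may mix integers, floats, strings, and other lists
mixed = [1, 2.5, "three", [4, 5]]

print(n, first, last)
print(type(mixed[3]))
```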


There are many other useful list methods. Use the [documentation](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) to investigate available methods for dealing with lists.

# Dictionaries

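Dictionaries are organized on the principle of key-value pairs. The keys can be used to access the values. They're most useful when your data is naturally organized in pairs and you want to look values up by name rather than by position. This occurs, for example, in storing metadata (data describing other data).

Keys can be ints, floats, strings, or any other hashable type, and each key must be unique. Values can be any data type. Since Python 3.7, dictionaries keep their keys in insertion order, but values are still looked up by key, not by position.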

Dictionaries are specified in Python using curly braces, with colons separating keys and values.

Let's take a look at an example dictionary:

In [None]:
example_dict = {
    "name": "Forough Farrokhzad",
    "year of birth": 1935,
    "year of death": 1967,
    "place of birth": "Iran",
    "language": "Persian"}

example_dict['year of birth']

print("The dictionary before remove is : ", example_dict)

The dictionary before remove is :  {'name': 'Forough Farrokhzad', 'year of birth': 1935, 'year of death': 1967, 'place of birth': 'Iran', 'language': 'Persian'}


In [None]:
del example_dict['year of birth']

print("The dictionary after remove is : ", example_dict)

The dictionary after remove is :  {'name': 'Forough Farrokhzad', 'year of death': 1967, 'place of birth': 'Iran', 'language': 'Persian'}
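
Besides deleting a key, a few other common dictionary operations can be sketched as follows. The example re-adds the key that was just deleted, using the value from the original dictionary. The names `has_language` and `missing`, and the key "occupation", are made up for illustration.

```python
example_dict = {
    "name": "Forough Farrokhzad",
    "year of death": 1967,
    "place of birth": "Iran",
    "language": "Persian"}

# Assigning to a new key adds a key-value pair
example_dict["year of birth"] = 1935

# 'in' checks whether a key exists
has_language = "language" in example_dict

# .get() returns a fallback instead of raising KeyError for a missing key
missing = example_dict.get("occupation", "not recorded")

# .items() yields each key together with its value
for key, value in example_dict.items():
    print(key, "->", value)
```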


# Data Frames

A common data structure you've likely already encountered is tabular data. Think of an Excel sheet: each column corresponds to a different feature of each datapoint, while rows correspond to different samples.

In scientific programming, tabular data is often called a "data frame". In Python, there is a specialized library called **pandas** which contains an object DataFrame that implements this data structure.

We're going to explore pandas more closely in Session 2, but let's try creating a **DataFrame** object right now.

First, we need to create a dictionary:

In [None]:
fruit = ['apple', 'orange', 'mango', 'strawberry', 'salmonberry', 'thimbleberry']
size = [3, 2, 3, 1, 1, 1]
color = ['red', 'orange', 'orange', 'red', 'orange', 'red']

fruits = {
    'fruit': fruit,
    'size': size,
    'color': color}

Next, we import the pandas library and pass in the dictionary to the **pd.DataFrame()** function, storing the result in a variable called df.

In [None]:
import pandas as pd

df = pd.DataFrame(fruits)
df

|   | fruit | size | color |
|---|---|---|---|
| 0 | apple | 3 | red |
| 1 | orange | 2 | orange |
| 2 | mango | 3 | orange |
| 3 | strawberry | 1 | red |
| 4 | salmonberry | 1 | orange |
| 5 | thimbleberry | 1 | red |


The keys became column names and the values became cells in the DataFrame. In addition, there is an **index** on the left that keeps track of the row.

Objects can also have **attributes**, or variables associated with the data type. We can get the number of columns and rows with **df.shape**, an attribute of the dataframe.

How many rows and columns does this dataframe have?

In [None]:
df.shape

(6, 3)

**Question**:
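Building a DataFrame from a dictionary raises an error when the lists are not all the same length. Why would that cause an error, and what are some ways to fix it? In the version below, every list has three entries, so the code runs and reports a shape of (3, 3).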

In [None]:
fruit = ['apple', 'orange', 'banana']
length = [3.2, 2.1, 3.1]
color = ['red', 'orange', 'yellow']

fruit_dict = {
    'fruit': fruit,
    'length': length,
    'color': color}

df_fruit = pd.DataFrame(fruit_dict)
df_fruit.shape

(3, 3)
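
One way a cell like the one above can fail is when the lists have different lengths, because every column of a DataFrame needs one value per row. The sketch below assumes that scenario; `bad_dict` and `error_message` are hypothetical names, and the shortened list is made up for the example.

```python
import pandas as pd

# 'length' has only two entries while 'fruit' has three (deliberate mismatch)
bad_dict = {
    'fruit': ['apple', 'orange', 'banana'],
    'length': [3.2, 2.1]}

error_message = None
try:
    pd.DataFrame(bad_dict)
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```

Possible fixes include supplying the missing value or using a placeholder such as None for the entry that is unknown.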

We can choose a single column by selecting the name of that column. pandas calls this a pd.Series object. The act of obtaining a particular subset of a data frame is often referred to as **slicing**. This uses bracket notation to select part of the data.

Check the type of the slice below:

In [None]:
# Bracket notation to choose a column
df['fruit']
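
One way to check the type is to pass the selected column to type(). The sketch below rebuilds a smaller version of the fruit table so that it runs on its own; the name `column` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'fruit': ['apple', 'orange', 'mango'],
    'size': [3, 2, 3],
    'color': ['red', 'orange', 'orange']})

# Selecting one column with brackets gives a Series, not a DataFrame
column = df['fruit']
print(type(column))
print(type(df))
```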

In [None]:
# Number of unique colors in the df
print(df['color'].nunique())

# The unique colors in the df
print(df['color'].unique())

2
['red' 'orange']


**Question**:
There is another pandas method, .value_counts(), which complements unique() by counting how often each value occurs. Read the [documentation](https://pandas.pydata.org/docs/reference/api/pandas.Series.value_counts.html) and apply value_counts() to the color column of df. How many 'red' and 'orange' fruits are in the DataFrame?

In [None]:
# Count how many fruits there are of each color
df['color'].value_counts()

# Loops

A **for** loop tells Python to execute some statements once for each value in a list, a character string, or some other set of values. Specifically, we structure our computation as: "for each thing in this group, do these operations".

In [None]:
tires = [41.2,35.7,28.4]

for pressure in tires:
    print(round(pressure))

print('the loop has ended')

41
36
28
the loop has ended
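
A for loop can also build up a result as it goes, and it works on character strings as well as lists. The following sketch assumes the `tires` list from above; `total`, `average`, and `letters` are names introduced for illustration.

```python
tires = [41.2, 35.7, 28.4]

# Accumulate a running total, one pressure at a time
total = 0
for pressure in tires:
    total += pressure
average = total / len(tires)
print(round(average, 1))

# Looping over a string visits one character per iteration
letters = []
for ch in "abc":
    letters.append(ch)
print(letters)
```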


**Question**:
For the data frame below, let's convert each elevation from feet to meters. Use the approximate conversion 1 ft ≈ 0.304 m (the exact value is 0.3048 m).

Proceed as follows:

- Extract the column elevation as a Series.
- Loop through the series.
- Convert each value to meters.
- Print the result.

In [None]:
import pandas as pd

mountains_df = pd.DataFrame(
    {'mountain': ['Mt. Whitney',
                  'Mt. Williamson',
                  'White Mountain Peak',
                  'North Palisade',
                  'Mt. Shasta',
                  'Mt. Humphreys'],
     'range': ['Sierra Nevada',
               'Sierra Nevada',
               'White Mountains',
               'Sierra Nevada',
               'Cascade Range',
               'Sierra Nevada'],
     'elevation': [14505, 14379, 14252, 14248, 14179, 13992]}
)

mountains_df

|   | mountain | range | elevation |
|---|---|---|---|
| 0 | Mt. Whitney | Sierra Nevada | 14505 |
| 1 | Mt. Williamson | Sierra Nevada | 14379 |
| 2 | White Mountain Peak | White Mountains | 14252 |
| 3 | North Palisade | Sierra Nevada | 14248 |
| 4 | Mt. Shasta | Cascade Range | 14179 |
| 5 | Mt. Humphreys | Sierra Nevada | 13992 |


In [None]:
# Extract the elevation column as a Series
elevation = mountains_df['elevation']

# Convert each elevation from feet to meters and print it
for e in elevation:
    print(e * .304)

4409.5199999999995
4371.215999999999
4332.608
4331.392
4310.416
4253.568


In [None]:
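def height():
    # Convert each elevation from feet to meters
    for i in mountains_df['elevation']:
        x = i * .304
    # This print is outside the loop, so it runs once and shows only the last value
    print(x)

height()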

4253.568
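
If all converted values are needed rather than only the last one, one option is to collect them in a list and return it. This is a sketch, not the workshop's solution: the function `to_meters`, its `factor` parameter, and the names `elevations` and `converted` are hypothetical. The elevations are the ones from the table above, and the factor is the approximate 0.304 used in this notebook.

```python
def to_meters(elevations_ft, factor=0.304):
    """Return a list with each elevation converted from feet to meters."""
    meters = []
    for e in elevations_ft:
        meters.append(e * factor)   # appended inside the loop, once per value
    return meters

elevations = [14505, 14379, 14252, 14248, 14179, 13992]
converted = to_meters(elevations)

for m in converted:
    print(round(m, 1))
```

Returning the list instead of printing inside the function lets the caller decide what to do with the results, such as rounding them or storing them in a new column.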
