## Notebook Topic: Getting Familiar with Python

<ins>Learning Objectives</ins>

1. To deal with missing data
2. To check object types

**Section III: Missing Data**

When working with real data, you will often have to deal with missing values.

For example, suppose you collect signals from individuals using your company's Global Positioning System (GPS) via satellite.  You do this in order to improve the maps used by clients.  The satellite collects clear signals when the weather is good but cannot collect any signals when the weather is bad (such as during a severe thunderstorm).  On such a day, your GPS cannot provide accurate estimated times of arrival, inform you of upcoming traffic jams, or warn you about the cops up ahead.

Because missing data is more common than we think, we need to be prepared to deal with the situation!

* Missing data will typically appear as ```NaN``` or ```nan```.

Less commonly, you may encounter data that is represented as positive or negative infinity, so we will learn how to identify those too.



In [1]:
# the math library has some useful functions
import math

x = math.nan
y = 3

math.isnan(y)

False

What happens when you check x?

In [2]:
w = -math.inf

math.isfinite(w)

False
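You might wonder why we need special functions like ```math.isnan``` instead of a plain ```==``` comparison. A minimal sketch using only the ```math``` library from above (the variable name `equal_to_itself` is just for illustration):

```python
import math

x = math.nan

# NaN never compares equal to anything, not even to itself
equal_to_itself = (x == x)
print(equal_to_itself)        # False

# so a dedicated check is needed to detect it
print(math.isnan(x))          # True

# infinities, by contrast, do compare equal to themselves
w = -math.inf
print(w == -math.inf)         # True
print(math.isinf(w))          # True
```

Because ```==``` silently returns False for NaN, ```math.isnan``` is the safer way to test for missing values.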

Now, what happens when we have a vector containing these values?  Let's find out!

In [3]:
# remember we need to use the numpy library for vectors
import numpy as np

z = np.array([y, x, w])

In [4]:
np.isinf(z)

array([False, False,  True])

In [5]:
np.isnan(z)

array([False,  True, False])

In [6]:
np.isfinite(z)

array([ True, False, False])
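The True/False arrays above can also be used to filter a vector. A minimal sketch, assuming we simply want to drop every value that is not finite (this is only one possible strategy and may not be the right one for your data; the names `mask`, `clean`, and `n_bad` are made up for this example):

```python
import math
import numpy as np

z = np.array([3, math.nan, -math.inf])

# np.isfinite returns one True/False per element (a boolean mask)
mask = np.isfinite(z)

# indexing with the mask keeps only the elements where the mask is True
clean = z[mask]
print(clean)

# ~mask flips True and False, so summing it counts the non-finite values
n_bad = int((~mask).sum())
print(n_bad)
```

Here `clean` holds only the value 3.0, and `n_bad` is 2 (one NaN and one negative infinity).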

What is the primary difference between the checks on **z** versus what we did to **x**, **y**, and **w** directly?

**Section IV: Object Type Checks**

Essentially, what we did was check whether our objects (the scalars **x**, **y**, and **w**, and the same values as elements of the array **z**) were NaN, infinite, or finite.  We can check object types more broadly, as we'll see in this section.

**Object Types**

In the *00 - Pre_class.ipynb* notebook, we used a function called ```type```.  If you don't remember, go review that notebook!

There are a few basic object types in Python:

*   Numeric (such as integers, floats)
*   Strings (individual letters, words, and sentences; all denoted with quotation marks)
*   Logical/Boolean (True and False)

*type* will inform you whether a particular variable is of one of the types above.

In [7]:
# does this return numeric?
type(math.inf)

float

In [8]:
# what about this?
type(math.nan)

float
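To see ```type``` at work on each of the basic types listed above, here is a small sketch. The values are chosen only for illustration, and ```isinstance``` is a built-in Python function (not used elsewhere in this notebook) that answers a yes/no question about a type:

```python
# one example of each basic type
examples = [3, 3.0, "3", True]

# collect the name of each value's type
type_names = [type(value).__name__ for value in examples]
print(type_names)                 # ['int', 'float', 'str', 'bool']

# isinstance checks whether a value is of a given type
print(isinstance(3.0, float))     # True
print(isinstance("3", float))     # False

# note: in Python, bool is a special kind of int
print(isinstance(True, int))      # True
```

Notice that ```3```, ```3.0```, and ```"3"``` look alike but have three different types.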

There are several higher-level object types.

* Vectors (which we've seen a few of already) and matrices (we'll see in the near future) can only be all numeric or all strings or all boolean, etc.  We will become better acquainted with the *numpy* library for these.

* Data frames can have columns *from* each of the basic types.  Data frames are an object we will use a lot for data analysis.  One of the main libraries we will work with involving these data types is *pandas*.
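To illustrate the first point about vectors, here is a small sketch of what *numpy* does when you hand it mixed types: it converts every element to one common type. The variable names are made up for this example.

```python
import numpy as np

# an integer, a float, and a boolean all become floats
mixed_numbers = np.array([1, 2.5, True])
print(mixed_numbers)          # [1.  2.5 1. ]
print(mixed_numbers.dtype)    # float64

# a number mixed with a string becomes all strings
mixed_text = np.array([1, "a"])
print(mixed_text)
print(mixed_text.dtype.kind)  # 'U' means unicode string
```

The ```dtype``` attribute reports the single type shared by all elements of the array.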


**Section V: Casting Types**

You can convert (cast) an object from one type to another as needed, but this only makes sense for certain pairs of types.  For example,

*   string <-> numeric
*   boolean <-> numeric
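For single values (rather than vectors), these conversions can be sketched with Python's built-in ```float```, ```str```, ```int```, and ```bool``` functions. The variable names are only for illustration:

```python
# string <-> numeric
as_float = float("3.5")
as_string = str(3.5)
print(as_float, as_string)    # 3.5 3.5

# boolean <-> numeric
as_int = int(True)
as_bool = bool(0)
print(as_int, as_bool)        # 1 False

# careful: any non-empty string counts as True, even "False"
surprising = bool("False")
print(surprising)             # True

# not every string can become a number
try:
    float("abc")
    failed = False
except ValueError:
    failed = True
print(failed)                 # True
```

The last two cases show why casting only makes sense for some combinations of values and types.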

In [9]:
# when you have a vector of strings of numbers
alpha = np.array(["1", "2", "3", "4", "5"])
alpha.dtype

dtype('<U1')

In [11]:
alpha.astype('float')


array([1., 2., 3., 4., 5.])

In [13]:
# when you have numbers you need to be strings
x = np.random.normal(loc=0, scale=1, size=5)
x


array([ 0.66634752, -0.50086467,  0.10832358, -0.93509705, -1.52857658])

In [14]:

y = [str(i) for i in x]
type(y[0]) #is this true for all 5 elements? check in a new code chunk

str

## Conclusion

Name at least four different things you *think* you were __meant__ to learn from this notebook.

**Note:** I highly recommend you read Chapters 1-3 for an alternative introduction to Python and Jupyter notebooks.