# 1D Data - NumPy and Pandas

## <center> Pandas' Series vs NumPy's Arrays </center>

### NumPy Arrays

NumPy arrays are similar to Python lists in several ways. The table below summarizes some of the similarities and differences between the two:

<img src="../figures/sim_diff_table.png" alt="Table of similarities and differences between Python lists and NumPy arrays" />
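
The table above is an image, so its exact rows are not reproduced here. As a minimal sketch of the kind of comparison it describes, the snippet below shows one similarity and two well-known differences between lists and arrays. The variable names are illustrative only, and the single-argument `print(...)` form is used so that the code runs under both Python 2 and Python 3.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

# Similarity: both support indexing, slicing and len()
first_from_list = py_list[0]
first_from_array = np_array[0]

# Difference 1: "+" concatenates lists but adds arrays element by element
list_sum = py_list + py_list        # [1, 2, 3, 1, 2, 3]
array_sum = np_array + np_array     # array([2, 4, 6])

# Difference 2: a list can mix types, while an array stores a single dtype
mixed_list = [1, 2.5, 3]
mixed_array = np.array(mixed_list)  # the integers are converted to floats

print(list_sum)
print(array_sum)
print(mixed_array.dtype)
```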

### Test Code for NumPy Arrays

In [3]:
# Import NumPy and Pandas
import numpy as np
import pandas as pd

In [12]:
# Declaring a 1D NumPy Array

# First 20 countries with employment data
countries = np.array([
    'Afghanistan', 'Albania', 'Algeria', 'Angola', 'Argentina',
    'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas',
    'Bahrain', 'Bangladesh', 'Barbados', 'Belarus', 'Belgium',
    'Belize', 'Benin', 'Bhutan', 'Bolivia',
    'Bosnia and Herzegovina'
])

# Employment data in 2007 for those 20 countries
employment = np.array([
    55.70000076,  51.40000153,  50.5       ,  75.69999695,
    58.40000153,  40.09999847,  61.5       ,  57.09999847,
    60.90000153,  66.59999847,  60.40000153,  68.09999847,
    66.90000153,  53.40000153,  48.59999847,  56.79999924,
    71.59999847,  58.40000153,  70.40000153,  41.20000076
])

In [13]:
# Accessing elements
# Elements can be accessed individually through indexing
# The first element is at index 0 and the last at index len(countries) - 1

print countries[0]     # Prints the 1st element
print countries[3]     # Prints the 4th element
print countries[19]    # Prints the last element

Afghanistan
Angola
Bosnia and Herzegovina
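
Indexing also accepts negative positions, which count back from the end, and raises an `IndexError` for positions past the end. The sketch below uses a shortened, illustrative copy of the `countries` array so that it is self-contained.

```python
import numpy as np

# Shortened sample of the countries array, for illustration only
countries = np.array(['Afghanistan', 'Albania', 'Algeria', 'Angola'])

last = countries[-1]                        # Negative indices count from the end
also_last = countries[len(countries) - 1]   # Same element, using length - 1
second_to_last = countries[-2]

print(last)
print(second_to_last)

# Indexing one past the end raises IndexError, just as it does for a list
try:
    countries[len(countries)]
    out_of_bounds_raised = False
except IndexError as error:
    out_of_bounds_raised = True
    print(error)
```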


In [39]:
# Slicing
# A range of elements can be accessed through slicing
# As with Python lists, a slice includes the start index but excludes the end index

print countries[0:3]   # Prints the 1st to 3rd elements
print countries[:3]    # Prints the 1st to 3rd elements (start defaults to 0)
print countries[17:]   # Prints the 18th to last elements
print countries[:]     # Prints the whole array

['Afghanistan' 'Albania' 'Algeria']
['Afghanistan' 'Albania' 'Algeria']
['Bhutan' 'Bolivia' 'Bosnia and Herzegovina']
['Afghanistan' 'Albania' 'Algeria' 'Angola' 'Argentina' 'Armenia'
 'Australia' 'Austria' 'Azerbaijan' 'Bahamas' 'Bahrain' 'Bangladesh'
 'Barbados' 'Belarus' 'Belgium' 'Belize' 'Benin' 'Bhutan' 'Bolivia'
 'Bosnia and Herzegovina']
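
One place where array slicing differs from list slicing is worth a sketch: a list slice is a new list, whereas a basic NumPy slice is a view onto the same data. The arrays and names below are illustrative and not part of the original notebook.

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50])
plain_list = [10, 20, 30, 40, 50]

# A list slice is a copy, so changing it leaves the original list alone
list_slice = plain_list[0:3]
list_slice[0] = 99

# An array slice is a view, so changing it also changes the original array
array_view = numbers[0:3]
array_view[0] = 99

# Call .copy() when an independent array is needed
array_copy = numbers[0:3].copy()
array_copy[1] = -1

# Slices also accept a step: every second element, starting at index 0
every_other = numbers[::2]

print(plain_list)
print(numbers)
print(every_other)
```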


In [22]:
# Element types

#?countries.dtype                                       # Displays the docstring for dtype

print countries.dtype                                   # Fixed-width strings with a max length of 22
print employment.dtype                                  # 64-bit floats (float64)
print np.array([0, 1, 2, 3]).dtype                      # Integers; int32 here, but the default width depends on the platform
print np.array([1.0, 1.5, 2.0, 2.5]).dtype              # 64-bit floats (float64)
print np.array([True, False, True]).dtype               # Booleans (bool)
print np.array(['AL', 'AK', 'AZ', 'AR', 'CA']).dtype    # Fixed-width strings with a max length of 2

|S22
float64
int32
float64
bool
|S2
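
The sketch below explores how NumPy picks a dtype and how to override it. It inspects `dtype.kind` rather than the full dtype name because the name varies by environment: string arrays report `S` (bytes) under Python 2 and `U` (unicode) under Python 3, and the default integer width depends on the platform. The array names are illustrative.

```python
import numpy as np

codes = np.array(['AL', 'AK', 'AZ'])
whole = np.array([0, 1, 2, 3])
mixed = np.array([0, 1.5, 2])                       # Integers are upcast to float64
explicit = np.array([0, 1, 2], dtype=np.float64)    # Request a dtype up front
converted = whole.astype(np.float64)                # Convert an existing array

print(codes.dtype.kind)    # 'S' on Python 2, 'U' on Python 3
print(whole.dtype.kind)    # 'i' for signed integer
print(mixed.dtype)
print(explicit.dtype)
print(converted.dtype)

# Fixed-width string arrays silently truncate values longer than their width
codes[0] = 'Alabama'
print(codes[0])
```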


In [40]:
# Looping
# Traversing an array works almost the same way as traversing a Python list

if False:                                                         # Change to True to run the code
    # Loop directly over the elements
    for country in countries:
        print 'Examining country {}'.format(country)

if False:                                                         # Change to True to run the code
    # Loop over indices to read two parallel arrays together
    for i in range(len(countries)):
        country = countries[i]
        country_employment = employment[i]
        print 'Country {} has employment {}'.format(country,
                country_employment)
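
Explicit loops are often unnecessary with NumPy. As a sketch of some alternatives that the notebook does not use at this point, the snippet below pairs the two arrays with `zip`, finds the top country with `argmax`, and filters with a boolean mask. It uses only the first five entries of `countries` and `employment` to stay short.

```python
import numpy as np

# First five entries of the notebook's data, for illustration
countries = np.array(['Afghanistan', 'Albania', 'Algeria', 'Angola', 'Argentina'])
employment = np.array([55.70000076, 51.40000153, 50.5, 75.69999695, 58.40000153])

# zip pairs up the two arrays without manual index bookkeeping
pairs = list(zip(countries, employment))
for country, country_employment in pairs:
    print('Country {} has employment {}'.format(country, country_employment))

# argmax returns the index of the largest value; reuse it on the other array
top_index = employment.argmax()
top_country = countries[top_index]
print(top_country)

# A boolean mask selects every country whose employment exceeds 55
above_55 = countries[employment > 55]
print(above_55)
```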

In [41]:
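# NumPy functions
# A few of the summary functions available on NumPy arrays

print employment.mean()   # Arithmetic mean
print employment.std()    # Standard deviation (population form by default)
print employment.max()    # Largest value
print employment.sum()    # Sum of all values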

58.6850000385
9.33826911369
75.69999695
1173.70000077
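
To make clear what `mean()` and `std()` compute, the sketch below reproduces them by hand on a small illustrative sample. By default `std()` divides by the number of elements (the population standard deviation); passing `ddof=1` gives the sample standard deviation instead.

```python
import numpy as np

# Small illustrative sample, not the notebook's full data
employment = np.array([55.7, 51.4, 50.5, 75.7])

# Mean: the sum of the values divided by how many there are
manual_mean = employment.sum() / len(employment)

# Population standard deviation: root of the mean squared deviation
deviations = employment - manual_mean
manual_std = np.sqrt((deviations ** 2).mean())

# Sample standard deviation divides by (n - 1) instead of n
sample_std = employment.std(ddof=1)

print(np.isclose(manual_mean, employment.mean()))
print(np.isclose(manual_std, employment.std()))
print(sample_std > employment.std())
```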
