# Arrays and Vectors

The Python types we've seen so far all belong to the Python standard library.  This means that they're automatically available when you start writing Python code.  As with any modern language, however, programmers are always adding new functionality to Python and they often want to share their code with others.  The main way this is done is with packages.

A package contains extra object types, functions, and variables that are not included in base Python.  Some packages are specialized to niche applications, while others are so popular that nearly every coder uses them.

One package in particular, NumPy, is the foundation for scientific and data computing in Python. Many other data science packages are built on top of NumPy, so it's important to understand how it works.

[NumPy](http://www.numpy.org/) is the core of the [pydata](http://pydata.org/) ecosystem.

------

*From the NumPy website*

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

------

We'll often use NumPy to work with data.  In this notebook, we'll begin introducing its functionality with a few simple examples.

To use the functionality provided by a package, we first have to import it into our program with the `import` keyword.  When we import a package, we can also rename it with the `as` keyword.  By convention in the pydata community, data packages are imported under shorter names, and NumPy is traditionally renamed to `np`.


In [1]:
import numpy as np

In [4]:
np.arange(0,20)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

This should look familiar: the `arange` function is analogous to the `range` function we saw earlier.  Instead of returning a range, however, it returns an object called an array.
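To make the comparison concrete, here is a small sketch that puts the two side by side. The names `py_range` and `np_values` are made up for this illustration.

```python
import numpy as np

# Built-in range: a lazy sequence of Python integers
py_range = range(0, 20)

# NumPy arange: an ndarray holding the same values
np_values = np.arange(0, 20)

print(type(py_range))   # <class 'range'>
print(type(np_values))  # <class 'numpy.ndarray'>

# Both contain the same numbers in the same order
print(list(py_range) == np_values.tolist())  # True
```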

Arrays are similar to the sequences we saw earlier.  For example, we can access items by their indices.

In [5]:
my_array = np.arange(20)
my_array[2] = 12
my_array

array([ 0,  1, 12,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
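As a small sketch of index access, arrays also accept the same negative indices and slices that lists do. The names `first`, `last`, and `middle` are only illustrative.

```python
import numpy as np

my_array = np.arange(20)

first = my_array[0]     # first element: 0
last = my_array[-1]     # negative indices count from the end: 19
middle = my_array[5:8]  # a slice holding the elements at positions 5, 6, and 7

print(first, last)
print(middle.tolist())  # [5, 6, 7]
```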

Arrays also have some unique characteristics.  For one thing, all the objects contained in an array must be the same type.  This may seem like a pesky restriction, but it allows arrays to work very efficiently on large amounts of data.

In [14]:
my_array[5] = "shoe"

ValueError: invalid literal for int() with base 10: 'shoe'
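NumPy applies the same-type rule when an array is created, too. As a rough illustration (the name `mixed` is made up for this example), building an array from two integers and a float gives every element a single floating-point type:

```python
import numpy as np

# Two integers and one float: NumPy picks one type that can hold them all
mixed = np.array([1, 2, 3.5])

print(mixed.dtype)     # float64
print(mixed.tolist())  # [1.0, 2.0, 3.5]
```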

The data type of an array can be accessed through its `dtype` attribute, which describes every object stored in the array.  Because that type is fixed when the array is created, converting to a different type means building a new array, for example with the `astype` method.

In [16]:
my_array.dtype

dtype('int64')
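If a different type is needed, one common approach is the `astype` method, which returns a converted copy and leaves the original untouched. A minimal sketch, with `ints` and `floats` as illustrative names:

```python
import numpy as np

ints = np.arange(5)          # an integer array
floats = ints.astype(float)  # a new array with a floating-point dtype

print(ints.dtype.kind)  # 'i' (the exact integer width depends on the platform)
print(floats.dtype)     # float64
```

The original array keeps its integer type, so `astype` is a conversion into a new array, not an in-place change.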

Another thing about arrays is that once they're created, their size is fixed. You can't pop or append to an array. That's because of the way that arrays allocate memory under the hood. The fixed size allows arrays to be optimized for high performance.

In [6]:
new_array = np.array([5,10,15,20])

In [7]:
new_array.append(5)

AttributeError: 'numpy.ndarray' object has no attribute 'append'

In [8]:
new_array.pop()

AttributeError: 'numpy.ndarray' object has no attribute 'pop'
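If you do need a longer array, one option is the `np.append` function, which builds and returns a new array instead of resizing the original. A minimal sketch, reusing the values of `new_array` (the name `longer` is only for illustration):

```python
import numpy as np

new_array = np.array([5, 10, 15, 20])

# np.append copies the data into a brand-new, larger array
longer = np.append(new_array, 5)

print(longer.tolist())     # [5, 10, 15, 20, 5]
print(new_array.tolist())  # [5, 10, 15, 20]  (the original is unchanged)
```

Because every call copies the whole array, growing an array this way inside a loop tends to be slow. It is usually better to collect values in a list and convert once at the end.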

You can see in the error messages that the arrays we're dealing with have type `'numpy.ndarray'`.  The name `ndarray` stands for n-dimensional array, and we can choose n to suit the data we're working with.

For example, we can create a 2-dimensional matrix with random numbers between 0 and 1.

In [9]:
np.random.rand(5,5)

array([[ 0.23759964,  0.92623159,  0.89596264,  0.86149774,  0.61031495],
       [ 0.34947716,  0.74066764,  0.46579163,  0.20468334,  0.79393535],
       [ 0.84405822,  0.76617819,  0.51059147,  0.65829251,  0.31532268],
       [ 0.57141397,  0.01661977,  0.79456107,  0.31164229,  0.72433503],
       [ 0.24072898,  0.52185214,  0.94749936,  0.47252109,  0.74999639]])

Or a cube.

In [10]:
np.random.rand(5,5,5)

array([[[ 0.13240316,  0.67295571,  0.39641511,  0.76284256,  0.79563024],
        [ 0.14449041,  0.16531911,  0.29541863,  0.2832525 ,  0.84724166],
        [ 0.93559034,  0.10699395,  0.35908563,  0.39600501,  0.93883066],
        [ 0.83834294,  0.23477842,  0.97508993,  0.2139928 ,  0.99263925],
        [ 0.89175956,  0.22302191,  0.43772451,  0.85588403,  0.28807383]],

       [[ 0.54605163,  0.4283227 ,  0.7974993 ,  0.72614286,  0.60133047],
        [ 0.51267086,  0.57327063,  0.89185793,  0.95383522,  0.56109371],
        [ 0.74257783,  0.55680615,  0.85579919,  0.25425447,  0.48351766],
        [ 0.94654672,  0.42364808,  0.59804291,  0.41736754,  0.13476915],
        [ 0.43276291,  0.49085213,  0.10983201,  0.32395373,  0.76576228]],

       [[ 0.66670408,  0.33452593,  0.83695886,  0.67013543,  0.08078705],
        [ 0.86371776,  0.21748367,  0.04645938,  0.96619443,  0.82305266],
        [ 0.30855464,  0.27498444,  0.84865146,  0.98669306,  0.9787175 ],
        [ 0.30641559,

We can keep going to higher and higher dimensional spaces, but most applications only require 2 or 3 dimensions.
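To check how many dimensions an array has, you can look at its `ndim` and `shape` attributes. The sketch below uses illustrative names (`matrix`, `cube`, `hyper`) and an arbitrary 4-dimensional shape:

```python
import numpy as np

matrix = np.random.rand(5, 5)       # 2 dimensions
cube = np.random.rand(5, 5, 5)      # 3 dimensions
hyper = np.random.rand(2, 3, 4, 5)  # 4 dimensions

print(matrix.ndim, matrix.shape)  # 2 (5, 5)
print(cube.ndim, cube.shape)      # 3 (5, 5, 5)
print(hyper.ndim, hyper.shape)    # 4 (2, 3, 4, 5)
```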

*now add some of the special methods that it provides*