# NUMPY 

## Importing Data into a numpy Array


- A numpy array can only contain one type of variable (e.g., str or int)
- When importing data (e.g., with `numpy.genfromtxt`), numpy assumes that the data consists of floating-point values (float) unless told otherwise. Thus, the data type (dtype) should be specified.
    - There are codes for this:
        - 'float64'   : 64-bit floating-point number
        - 'uint32'    : 32-bit unsigned integer 
        - 'U75'       : 75-character unicode string
        - (str, 35)   : 35-character string
        - ('U', 10)   : 10-character unicode string
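
As a minimal sketch of how these codes behave, each one can be passed to `numpy.dtype` and inspected. The variable names are illustrative only, and dtype names are written in lowercase here.

```python
import numpy

# Each code from the list, turned into a dtype object.
float_type = numpy.dtype("float64")    # 64-bit floating-point number
uint_type = numpy.dtype("uint32")      # 32-bit unsigned integer
text_type = numpy.dtype("U75")         # unicode string of up to 75 characters
str_type = numpy.dtype((str, 35))      # string of up to 35 characters
short_type = numpy.dtype(("U", 10))    # unicode string of up to 10 characters

print(float_type.itemsize)  # bytes per element: 8
print(uint_type.itemsize)   # bytes per element: 4
print(text_type.itemsize)   # 4 bytes per character: 300
print(str_type == numpy.dtype("U35"))    # True
print(short_type == numpy.dtype("U10"))  # True
```

The number in a 'U' code counts characters, not bytes; numpy stores each character in 4 bytes.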

In [166]:
import numpy

my_data = numpy.genfromtxt("test_data/world_alcohol.csv", delimiter=",", dtype="U75", skip_header=1)

In [168]:
print(my_data)

[['1986' 'Western Pacific' 'Viet Nam' 'Wine' '0']
 ['1986' 'Americas' 'Uruguay' 'Other' '0.5']
 ['1985' 'Africa' "Cte d'Ivoire" 'Wine' '1.62']
 ..., 
 ['1986' 'Europe' 'Switzerland' 'Spirits' '2.54']
 ['1987' 'Western Pacific' 'Papua New Guinea' 'Other' '0']
 ['1986' 'Africa' 'Swaziland' 'Other' '5.15']]
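
To see why the dtype matters when importing, the sketch below reads the same text twice: once with the default float dtype and once as strings. It assumes a small in-memory sample in place of the CSV file; the header names and the `sample` variable are hypothetical.

```python
import io
import numpy

# Hypothetical two-row sample laid out like the rows printed above.
sample = (
    "Year,Region,Country,Type,Liters\n"
    "1986,Americas,Uruguay,Other,0.5\n"
    "1986,Europe,Switzerland,Spirits,2.54\n"
)

# Default dtype is float: text that cannot be parsed as a number becomes nan.
as_float = numpy.genfromtxt(io.StringIO(sample), delimiter=",", skip_header=1)

# Reading everything as unicode strings keeps the text intact.
as_text = numpy.genfromtxt(io.StringIO(sample), delimiter=",", skip_header=1, dtype="U75")

print(as_float)
print(as_text)
```

With the float default only the numeric columns survive, which is why the import above reads every column as a string.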


## Creating numpy Arrays

In [169]:
my_vector = numpy.array([10, 20, 30])
my_matrix = numpy.array([[1, 2, 3], [10, 20, 30]])

print(my_vector)
print("")
print(my_matrix)

[10 20 30]

[[ 1  2  3]
 [10 20 30]]


If a numpy array contains elements of multiple types, they will be converted to one type:

In [170]:
my_mixed_data_matrix = numpy.array([ [1, 'two', True], [10, 20, 30] ])
my_mixed_data_matrix # '<U11' in the output below means 'unicode string of up to 11 characters'

array([['1', 'two', 'True'],
       ['10', '20', '30']], 
      dtype='<U11')
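
The conversion picks one type that can hold every element. A short sketch with illustrative variable names: integers mixed with a float become floats, and numbers mixed with text become strings.

```python
import numpy

ints_and_float = numpy.array([1, 2, 3.5])           # integers are promoted to float
numbers_and_text = numpy.array([1, 2.5, "three"])   # everything becomes a string

print(ints_and_float.dtype)         # float64
print(numbers_and_text.dtype.kind)  # U (unicode string)
print(numbers_and_text[0])          # the integer 1 is now the string '1'
```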

## Queries on a numpy Array

### Array Dimensions
**.shape**

In [171]:
my_vector = numpy.array([10, 20, 30])
my_matrix = numpy.array([[1, 2, 3], [10, 20, 30]])

print(my_vector.shape)
print(my_matrix.shape)

(3,)
(2, 3)
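
Since `.shape` is an ordinary tuple, it can be unpacked into row and column counts. A small sketch, with `n_rows` and `n_cols` as illustrative names:

```python
import numpy

my_matrix = numpy.array([[1, 2, 3], [10, 20, 30]])

# For a two-dimensional array, shape is (rows, columns).
n_rows, n_cols = my_matrix.shape
print(n_rows, n_cols)  # 2 3

print(my_matrix.ndim)  # number of dimensions: 2
print(my_matrix.size)  # total number of elements: 6
```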


### Array Type
**.dtype**

In [172]:
my_matrix.dtype

dtype('int32')
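
The default integer type depends on the platform and the numpy version, so the same cell may report `int64` elsewhere. One way to avoid the ambiguity, sketched here with an illustrative `my_float_matrix` name, is to request the dtype explicitly:

```python
import numpy

# Request 64-bit integers explicitly instead of relying on the platform default.
my_matrix = numpy.array([[1, 2, 3], [10, 20, 30]], dtype=numpy.int64)
print(my_matrix.dtype)  # int64

# astype returns a converted copy; the original array keeps its dtype.
my_float_matrix = my_matrix.astype(numpy.float64)
print(my_float_matrix.dtype)  # float64
```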

## Indexing and Slicing numpy Arrays
**[x,y] | [:, y] | [a:b,x:y] | [:,x:y]**

See also the 'Indexing with Numpy' section in the Python notebook. The inline comments below follow the reading conventions used in that section.

<a name="indexing"></a>
### Indexing

In [173]:
my_matrix = numpy.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 43]
])

Select a specific cell:  

In [174]:
                       ####################################################################
print(my_matrix[0,2])  # Read '[0,2]' as '1st and 3rd'                                    #
                       # When indexing in two dimensions, read '[x, y]' as 'x+1 and y+1'  #
                       ####################################################################
                       # Read as: Select...
                       # first row
                       # third column 

15


Select an entire row:

In [175]:
print(my_matrix[0,:])  # Read as: Select...
                       # first row
                       # all columns (':' means 'all' in numpy) 
                       #
                       # Selects first row 
                       #
                       # my_matrix[0] would also select the first row,
                       # but [0,:] states explicitly that all columns are selected.

[ 5 10 15]


Select an entire column:

In [176]:
print(my_matrix[:,0])  # Read as: Select...
                       # all rows
                       # the first column
                       # 
                       # Selects column 1 

[ 5 20 35]


Select two rows and a column:

In [177]:
print(my_matrix[1:3,2]) # Read as: Select...
                        # rows 2 and 3 (reminder: read "[x:y]" as "x+1 to y")
                        # column 3 (reminder    : read "[z]" as "z+1")
                        #
                        # Selects rows 2 and 3 of 3rd column

[30 43]
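
The remaining patterns, `[a:b,x:y]` and `[:,x:y]`, take a range on both axes. A minimal sketch on the same matrix, with `block` and `left_columns` as illustrative names:

```python
import numpy

my_matrix = numpy.array([
    [5, 10, 15],
    [20, 25, 30],
    [35, 40, 43]
])

# [a:b,x:y]: rows 1 and 2, columns 2 and 3
block = my_matrix[0:2, 1:3]
print(block)
# [[10 15]
#  [25 30]]

# [:,x:y]: all rows, columns 1 and 2
left_columns = my_matrix[:, 0:2]
print(left_columns)
# [[ 5 10]
#  [20 25]
#  [35 40]]
```

As with a single range, the end index is exclusive on both axes.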
