## Mixed type arrays

NumPy can create arrays whose elements combine several types (structured arrays) by specifying a list of field names and types.

In [1]:
# Import
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# Create a list with 10 tuples (name, integer_number, decimal_number)
# Create an array of mixed types by specifying field names and types
# Print the type of the array
nombres = np.loadtxt("data/50_nombres.txt", dtype="str")
lista = [(np.random.choice(nombres), np.random.randint(10), np.random.random()*10) for n in range(10)]

tipos = [("nombre", "|S9"), ("entero", "int"), ("decimal", "float")]
arr = np.array(lista, dtype=tipos)
print(arr.dtype)

[('nombre', 'S9'), ('entero', '<i4'), ('decimal', '<f8')]


In [3]:
# The field names are available through array.dtype.names
print(arr.dtype.names)

('nombre', 'entero', 'decimal')


In [4]:
# We can get the values of any field by indexing with its name.
res = arr['entero']
print(res)
print(res.dtype)

[4 2 8 6 7 8 7 7 8 1]
int32
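
The cells above depend on an external file of names. As a self-contained sketch of the same idea, the block below builds a mixed-type array from a few hypothetical records (the sample values and the variable names `records`, `field_types` and `people` are illustrative, not from the original). It uses the Unicode type `"U9"` instead of the byte-string type `"S9"`, an assumption made so that the names compare equal to ordinary Python 3 strings.

```python
import numpy as np

# Hypothetical sample records: (name, integer_number, decimal_number)
records = [("Ana", 3, 7.5), ("Luis", 8, 1.25), ("Marta", 5, 9.0)]

# One (field name, type) pair per column; "U9" holds up to 9 text characters
field_types = [("nombre", "U9"), ("entero", "int64"), ("decimal", "float64")]
people = np.array(records, dtype=field_types)

# Indexing with an integer returns one record;
# indexing with a field name returns a homogeneous array
first_record = people[0]
integers = people["entero"]
total = integers.sum()

print(people.dtype.names)
print(first_record["nombre"], total)
```

Each field behaves like a regular array of its own type, so the usual NumPy operations (`sum`, slicing, comparisons) apply to it directly.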


***
### genfromtxt function

In addition to its functions for saving and loading array contents to and from files, numpy provides the genfromtxt function, a more advanced way to read data files in text format.

>`numpy.genfromtxt(fname, dtype='float', comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None)`

> Most important arguments:

> - *fname*: file object or string with the path to the file.
> - *dtype*: data type of the output array. If None is given, the type of each column is determined from its contents.
> - *delimiter*: column separator character. It also accepts an integer or list of integers as widths for each field.
> - *names*: (None, True, str, sequence) Names to identify the columns. If True, names are read from the first line after skip_header. If not specified and the result is an array of mixed types, the field names will be "f0", "f1", "f2", ..., "fn".
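
The main arguments can be tried without any file on disk. The following is a minimal sketch that reads a hypothetical three-column text from an in-memory `io.StringIO` object (the sample text and the variable names are illustrative). It also passes `encoding="utf-8"`, an argument that is not part of the signature listed above and is assumed to be available (NumPy 1.14 or later), so that text columns are read as ordinary strings.

```python
import io
import numpy as np

# Hypothetical in-memory file; the first line holds the column names
text = "ciudad;area;poblacion\nA;10.5;1200\nB;3.25;800\n"

# dtype=None: the type of each column is inferred from its contents
# names=True: field names are taken from the first line
table = np.genfromtxt(io.StringIO(text), delimiter=";", dtype=None,
                      names=True, encoding="utf-8")

# Without names, and skipping the header line, the fields of the
# mixed-type result receive the default names f0, f1, f2
unnamed = np.genfromtxt(io.StringIO(text), delimiter=";", dtype=None,
                        skip_header=1, encoding="utf-8")

print(table.dtype.names)
print(unnamed.dtype.names)
```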

> Other arguments:

> - *comments*: character for lines containing comments (# by default).
> - *skip_header*: number of lines to skip at the beginning of the file.
> - *skip_footer*: number of lines to skip at the end of the file.
> - *missing_values*: set of strings to represent missing values
> - *filling_values*: set of values to use for missing data
> - *usecols*: tuple with the columns to read (in case we only want to read some columns)
> - *excludelist*: list with names to be excluded
> - *deletechars*: text string with characters to delete from names
> - *autostrip*: boolean that indicates whether to remove blanks from the values (True) or not (False)
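
Several of these secondary arguments can be combined in one call. The sketch below uses a hypothetical in-memory text (its contents and the fill value `-1` are illustrative choices) with a title line, a comment line, an empty field and a trailing totals line.

```python
import io
import numpy as np

text = (
    "Report title line\n"   # dropped by skip_header=1
    "# a comment line\n"    # ignored: starts with the comments character
    "1;10;100\n"
    "2;;200\n"              # empty second field, treated as missing
    "3;30;300\n"
    "total;40;600\n"        # dropped by skip_footer=1
)

# Keep only the first two columns and replace missing values with -1
data = np.genfromtxt(io.StringIO(text), delimiter=";", skip_header=1,
                     skip_footer=1, usecols=(0, 1), filling_values=-1)

print(data)
```

Since no dtype is given, the result is a plain two-dimensional array of floats rather than a mixed-type array.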

> If we call this function with dtype=None and names=True, and the columns hold different types, a mixed-type ndarray with named fields is generated.

In [7]:
# Read the data from the file "data/muni_andalucia.csv".
# The first row contains the field names, and the columns are separated by ";"
# Get an array of mixed types with names
# Print the field names
arr = np.genfromtxt("data/muni_andalucia.csv", delimiter=";", dtype=None, names=True)
print(arr.dtype.names)

('Provincia', 'Municipio', 'Area', 'POB_2011', 'POB_2010', 'POB_2009', 'POB_2008', 'POB_2007', 'POB_2006', 'POB_2005', 'POB_2004', 'POB_2003', 'POB_2002', 'POB_2001', 'POB_2000')


In [8]:
# Calculate the change in population from 2000 to 2010.
# Save the result in a new array
# Print the first 10 elements of the resulting array
cambios = arr['POB_2010'] - arr['POB_2000']
print(cambios[:10])

[  9.65000000e+02   2.88800000e+03  -1.87000000e+02  -2.60000000e+01
  -1.00000000e+00   2.12000000e+02   3.40000000e+01   7.54000000e+02
  -1.39000000e+02   9.30000000e+02]
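
Because the result of the subtraction keeps the row order of the original array, each change can be matched back to its municipality. The sketch below illustrates this with a small hypothetical array standing in for the CSV (the municipality names and population figures are invented for the example; only the field names come from the file above).

```python
import numpy as np

# Hypothetical rows standing in for "data/muni_andalucia.csv"
tipos = [("Municipio", "U20"), ("POB_2010", "float64"), ("POB_2000", "float64")]
muni = np.array([("Alfa", 1500.0, 1200.0),
                 ("Beta", 800.0, 950.0),
                 ("Gamma", 4000.0, 3100.0)], dtype=tipos)

# Change in population from 2000 to 2010, one value per row
cambios = muni["POB_2010"] - muni["POB_2000"]

# Position of the largest increase, used to look up the municipality name
i_max = np.argmax(cambios)
mayor_aumento = muni["Municipio"][i_max]

print(mayor_aumento, cambios[i_max])
```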
