# General Concepts

In [2]:
import numpy as np

In [5]:
a = np.arange(5)
b = a
print(a)
print(b)

[0 1 2 3 4]
[0 1 2 3 4]


In [6]:
a[2] = 11
print(a)
print(b)

[ 0  1 11  3  4]
[ 0  1 11  3  4]


In [7]:
a[2] = 2

In [8]:
b = a.copy()

In [10]:
print(a)
print(b)

[0 1 2 3 4]
[0 1 2 3 4]


In [11]:
a[2] = 0
print(a)
print(b)

[0 1 0 3 4]
[0 1 2 3 4]


In [12]:
list(zip(a, b))

[(0, 0), (1, 1), (0, 2), (3, 3), (4, 4)]

In [13]:
a = np.random.randint(0, 10, 5)
b = np.random.randint(0, 10, 5)

In [18]:
A = np.column_stack((a, b))
A

array([[3, 0],
       [6, 9],
       [3, 0],
       [3, 9],
       [4, 0]])

In [20]:
np.apply_along_axis((lambda x: x[0]**2 + x[1]**3), axis=1, arr=A)

array([  9, 765,   9, 738,  16])
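The call above runs the Python lambda once per row. As a minimal sketch, the same result can be computed with plain column arithmetic. The array A is hard-coded here to match the output shown above, since the notebook's values came from random draws.

```python
import numpy as np

# Fixed copy of the matrix shown above (the original came from random draws).
A = np.array([[3, 0], [6, 9], [3, 0], [3, 9], [4, 0]])

# Row-by-row version: the lambda receives one row at a time.
by_row = np.apply_along_axis(lambda x: x[0]**2 + x[1]**3, axis=1, arr=A)

# Vectorized version: operate on whole columns at once.
vectorized = A[:, 0]**2 + A[:, 1]**3

print(by_row)
print(vectorized)
```

Both expressions give the same values. The column form avoids the per-row Python call, which usually matters for large arrays.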

---

## Broadcasting

Broadcasting lets an operator or a function act on two or more arrays even when they do not have exactly the same shape, provided the shapes are compatible.

In [23]:
a = np.random.randint(0, 10, 30).reshape(3, 2, 5)
b = np.random.randint(0, 10, 30).reshape(3, 5, 2)

In [24]:
a

array([[[8, 4, 0, 8, 2],
        [9, 2, 3, 2, 4]],

       [[0, 4, 5, 9, 0],
        [9, 4, 5, 8, 3]],

       [[2, 7, 0, 8, 1],
        [4, 6, 7, 3, 6]]])

In [25]:
b

array([[[1, 1],
        [5, 1],
        [4, 5],
        [4, 0],
        [8, 5]],

       [[2, 9],
        [7, 3],
        [1, 9],
        [2, 8],
        [7, 4]],

       [[0, 4],
        [7, 9],
        [5, 2],
        [8, 2],
        [9, 6]]])
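Arrays with shapes (3, 2, 5) and (3, 5, 2), like a and b above, cannot be combined elementwise, because their trailing dimensions differ and neither is 1. The sketch below uses np.arange data instead of random values (an assumption made for reproducibility) and contrasts that failure with two shapes that do broadcast against (3, 2, 5).

```python
import numpy as np

a = np.arange(30).reshape(3, 2, 5)
b = np.arange(30).reshape(3, 5, 2)

# Elementwise addition fails: trailing dimensions are 5 vs 2, and neither is 1.
try:
    a + b
    compatible = True
except ValueError:
    compatible = False

# A shape (5,) array is treated as (1, 1, 5) and stretched over the first two axes.
row = np.arange(5)
shifted = a + row

# A shape (3, 1, 1) array supplies one value per 2x5 block.
per_block = np.array([100, 200, 300]).reshape(3, 1, 1)
offset = a + per_block

print(compatible)
print(shifted.shape)
print(offset.shape)
```

Shapes are compared from the last axis backwards. Each pair of dimensions must be equal, or one of them must be 1.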

In [29]:
arr = np.array([(1, 'sekar', 36), (2, 'saskia', 34)], dtype='i2, S6, i4')

In [30]:
arr

array([(1, b'sekar', 36), (2, b'saskia', 34)],
      dtype=[('f0', '<i2'), ('f1', 'S6'), ('f2', '<i4')])

In [36]:
arr = np.array([(1, 'first', 0.25, 2-2j), (2, 'second', 0.5, 2+2j)],
               dtype=[('Id', 'int8'), ('Position', 'S6'), ('Value', 'float32'),
                      ('Complex', 'complex64')])

In [35]:
structured = np.array([(1, 'First', 0.5, 1+2j), (2, 'Second', 1.3, 2-2j), (3, 'Third', 0.8, 1+3j)],
                      dtype=[('id', 'i2'), ('position', 'S6'), ('value', 'f4'), ('complex', 'c8')])

In [37]:
arr.dtype.names

('Id', 'Position', 'Value', 'Complex')

In [38]:
a = np.arange(10)

In [39]:
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [40]:
a.dtype

dtype('int32')

In [41]:
a.dtype.name

'int32'

---

# Reading and Writing Array Data on Files

## Loading and Saving Data in Binary Files

- To save an array, for example one holding the results of a data analysis, call the save() function with the name of the file and the array itself. The .npy extension is appended automatically if the file name does not already end with it.
- To recover the data stored in a .npy file, call the load() function with the file name as argument, this time including the .npy extension.
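The two points above can be sketched as a round trip. The file name results and the temporary directory are assumptions for illustration, used so that the example leaves no files behind.

```python
import os
import tempfile

import numpy as np

results = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "results")   # no extension given
    np.save(base, results)                # NumPy appends .npy
    written = os.listdir(tmp)
    restored = np.load(base + ".npy")     # extension is required when loading

print(written)
print(np.array_equal(results, restored))
```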

In [46]:
M = np.array([(1, 'first',  0.0), 
              (2, 'second', 0.25),
              (3, 'third',  0.5),
              (4, 'fourth',  0.75),
              (5, 'fifth',  1.0)],
            dtype=[('id', 'int16'), ('position', 'str'),
                   ('value', 'float32')])

In [47]:
M

array([(1, '', 0.  ), (2, '', 0.25), (3, '', 0.5 ), (4, '', 0.75),
       (5, '', 1.  )],
      dtype=[('id', '<i2'), ('position', '<U'), ('value', '<f4')])

In [52]:
np.save('data', M)

In [53]:
data = np.load('data.npy')

In [54]:
data

array([(1, '', 0.  ), (2, '', 0.25), (3, '', 0.5 ), (4, '', 0.75),
       (5, '', 1.  )],
      dtype=[('id', '<i2'), ('position', '<U'), ('value', '<f4')])
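Note that the position field is empty in both outputs above. The dtype 'str' with no length produces a zero-width Unicode field (shown as '<U'), so the strings are lost before the array is ever saved. A minimal sketch of one way to keep them, assuming six characters are enough for every label, is to give the field an explicit width such as 'U6':

```python
import numpy as np

records = [(1, 'first', 0.0), (2, 'second', 0.25), (3, 'third', 0.5)]

# 'U6' reserves room for up to six Unicode characters per entry.
M_fixed = np.array(records,
                   dtype=[('id', 'int16'), ('position', 'U6'),
                          ('value', 'float32')])

print(M_fixed['position'])
print(M_fixed.dtype)
```

Longer labels would still be truncated to six characters, so the width should be chosen from the data.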

To read data from a text file into an array, NumPy provides a function
called genfromtxt(). It is typically called with three arguments: the name of the file
containing the data, the character that separates one value from the next (a comma in our case), and
whether the first row contains the column headers.

In [55]:
data = np.genfromtxt('data.csv', delimiter=',', names=True)

In [56]:
data

array([(1., 123., 1.4, 23.), (2., 110., 0.5, 18.), (3., 164., 2.1, 19.)],
      dtype=[('id', '<f8'), ('value1', '<f8'), ('value2', '<f8'), ('value3', '<f8')])
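The file data.csv itself is not shown in the notebook. The sketch below assumes contents reconstructed from the output above and feeds them to genfromtxt() through an in-memory io.StringIO object, so it runs without a file on disk.

```python
import io

import numpy as np

# Assumed contents of data.csv, reconstructed from the output shown above.
csv_text = """id,value1,value2,value3
1,123,1.4,23
2,110,0.5,18
3,164,2.1,19
"""

# names=True takes the field names from the first row.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', names=True)

print(data.dtype.names)
print(data['value1'])   # columns are accessed by header name
```

Because no dtype is given, every column is read as a float, which is why id appears as 1., 2., 3. in the output.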

---

# Conclusions

In this chapter, you saw the main aspects of the NumPy library and, through a series of examples, became
familiar with a range of features that underlie much of what you will encounter in the rest of the
book. Many of these concepts carry over to other scientific computing libraries that are
more specialized but are built on top of this library.
You saw how the ndarray extends the functionality of Python, making it a suitable
language for scientific computing and, in particular, for data analysis.
Knowledge of NumPy is therefore crucial for anyone who wants to work in
data analysis.
In the next chapter, we introduce a new library, pandas, which is built on NumPy.
It draws on all the basic concepts illustrated in this chapter and extends them to make them more
suitable for data analysis.

---

# Important Points

- Plain assignment (b = a) binds a second name to the same array, and basic slicing returns a view of the same data; neither makes a copy. Use copy() to get an independent array.
- column_stack stacks one-dimensional arrays as the columns of a two-dimensional array; row_stack stacks them as rows (it is an alias of vstack).
- Broadcasting follows rules. For example, in an operation between a one-dimensional array and a matrix, the one-dimensional array is treated as a two-dimensional array with a single row and is then stretched along that axis to match the shape of the matrix.
- np.save writes an array to a .npy file, np.load reads it back, and np.genfromtxt loads a text file such as a CSV (using the delimiter and names keywords).
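The first point can be checked directly. This sketch contrasts an alias, a slice view, a fancy-indexed copy, and an explicit copy; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(5)

alias = a                  # same object under a second name
view = a[1:4]              # basic slice: a view on a's data
fancy = a[[1, 2, 3]]       # integer-array indexing: a copy
independent = a.copy()     # explicit copy

a[2] = 11                  # modify the original

print(alias[2], view[1], fancy[1], independent[2])   # 11 11 2 2
print(np.shares_memory(a, view), np.shares_memory(a, fancy))
```

The alias and the view see the change because they share memory with a. The fancy-indexed result and the explicit copy do not.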