# Arrays in Python
Python provides several array-like container types:
* list
* tuple
* Numpy array


### Lists
* These are structures containing several individual pieces of data; you can mix data types within a list. 
* They are declared using square brackets and indexed starting at 0. 
* They can be modified item-by-item. 
* `listname.append(obj)` is a method that adds a new element to the end of the list.
* `listname.insert(arg1, obj)` is a method that inserts a new element at any position in the list. It takes two arguments: `arg1` is the position (index) and `obj` is the element to insert.
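
As a small sketch of the two methods above (the list name `planets` and its contents are made up for illustration):

```python
# Hypothetical list used only for illustration
planets = ['Mercury', 'Earth', 'Mars']

# append adds a single element at the end of the list
planets.append('Jupiter')

# insert takes a position and an object; here 'Venus' goes to index 1
planets.insert(1, 'Venus')

print(planets)
```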

In [5]:
a = [3.0,5,'bello',3+4j]

In [6]:
a[0]+1.5
a[2]+',world'
a[0]+=1.5; a # note that you can put two commands on one line with a ;

[4.5, 5, 'bello', (3+4j)]

### Tuple
* Tuples are like lists but are immutable.
* Tuples are written with round brackets.

In [7]:
a = (3.0,4.0)
a[0]+1 # this is legal
a

(3.0, 4.0)

In [8]:
a[0] += 1  # try this - will give an error

TypeError: 'tuple' object does not support item assignment
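
Since a tuple cannot be changed in place, one common workaround is to build a new tuple from the old one. A minimal sketch (the name `b` is just an illustration):

```python
a = (3.0, 4.0)

# Build a new tuple: the first element plus one, followed by the rest of a
b = (a[0] + 1,) + a[1:]

print(a)  # the original tuple is unchanged
print(b)
```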

## Numpy array
Numpy arrays are the closest thing to the arrays found in most other programming languages. The script must explicitly import the numpy package with the statement
```
import numpy as np
```

In [9]:
import numpy as np

A very powerful feature of Numpy arrays is implicit array indexing. Suppose that `a` is a two-dimensional array:
```
10 17 20
13 41 50
18 19 34
```

In [10]:
#You would declare this array using:
a=np.array([[10,17,20],[13,41,50],[18,19,34]])
print(a)

[[10 17 20]
 [13 41 50]
 [18 19 34]]
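
The indexing mentioned above can be sketched on this same array. This is only a brief illustration of selecting an element, a row, and a column:

```python
import numpy as np

a = np.array([[10, 17, 20],
              [13, 41, 50],
              [18, 19, 34]])

print(a[0, 1])   # row 0, column 1 -> 17
print(a[1])      # the whole of row 1 -> [13 41 50]
print(a[:, 0])   # column 0 of every row -> [10 13 18]
```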


In [11]:
#define the numpy array
a = np.array([73,67,98])
b = np.array([0.2,0.3,0.1])

In [12]:
# dot product:
# multiply the two arrays element-wise, then sum the results
dot_product = np.dot(a,b)
dot_product

44.5

In [13]:
# or compute the dot product by hand
total = a*b     # element-wise multiplication of the two arrays
total.sum()

44.5

## Python list vs Numpy
### Numpy operations are implemented in compiled C code, so calculations are much faster

In [14]:
# python list
arr1 = list(range(1000000))
arr2 = list(range(1000000,2000000))

# Numpy array
arr1_np = np.array(arr1)
arr2_np = np.array(arr2)

In [15]:
%%time
result = 0

for x1,x2 in zip(arr1,arr2):
    result += x1*x2
result

CPU times: user 152 ms, sys: 0 ns, total: 152 ms
Wall time: 152 ms


833332333333500000

In [16]:
%%time
np.dot(arr1_np,arr2_np)

CPU times: user 3.82 ms, sys: 3.68 ms, total: 7.5 ms
Wall time: 4.87 ms


833332333333500000
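
The `%%time` magic only works inside Jupyter/IPython. As a rough sketch of the same comparison in a plain script, one could use `time.perf_counter` (the variable names and the smaller array size below are assumptions made to keep the example quick; actual timings depend on the machine):

```python
import time
import numpy as np

n = 100_000  # smaller than the example above, to keep the sketch quick
list1 = list(range(n))
list2 = list(range(n, 2 * n))

# int64 is requested explicitly so the sum cannot overflow a 32-bit default
np1 = np.array(list1, dtype=np.int64)
np2 = np.array(list2, dtype=np.int64)

# Pure-Python loop
start = time.perf_counter()
loop_result = 0
for x1, x2 in zip(list1, list2):
    loop_result += x1 * x2
loop_time = time.perf_counter() - start

# Numpy dot product
start = time.perf_counter()
dot_result = np.dot(np1, np2)
dot_time = time.perf_counter() - start

print(loop_result == dot_result)
print(f"loop: {loop_time:.4f} s, np.dot: {dot_time:.4f} s")
```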

# 2-dimensional Numpy array
### array = row x column
### Matrix multiplication

In [26]:
galaxy_properties=([12,34,56,78],[12,67,45,90],[81,32,67,75],[89,91,77,51],[9,87,61,23])
galaxy_properties
galaxy_properties_np = np.array(galaxy_properties)

In [27]:
galaxy_properties_np

array([[12, 34, 56, 78],
       [12, 67, 45, 90],
       [81, 32, 67, 75],
       [89, 91, 77, 51],
       [ 9, 87, 61, 23]])

In [29]:
galaxy_properties_np.shape

(5, 4)

In [30]:
number=np.array([1,2,3,4])
number
print(number.shape)

(4,)


In [31]:
# matrix multiplication
np.matmul(galaxy_properties,number)

array([560, 641, 646, 706, 458])

In [33]:
(np.matmul(galaxy_properties,number)).shape

(5,)

In [34]:
# the @ operator is a shortcut for matrix multiplication with Numpy arrays
galaxy_properties_np @ number

array([560, 641, 646, 706, 458])
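
The shapes involved follow a simple rule: a `(5, 4)` matrix times a length-4 vector gives a length-5 result, and the inner dimensions must match. A minimal sketch with made-up values (`matrix` and `vector` are illustrative names):

```python
import numpy as np

matrix = np.ones((5, 4))          # 5 rows, 4 columns
vector = np.array([1, 2, 3, 4])   # length 4 matches the 4 columns

product = matrix @ vector
print(product.shape)

# A length-3 vector does not match the 4 columns, so Numpy raises ValueError
try:
    matrix @ np.array([1, 2, 3])
except ValueError as err:
    print("shape mismatch:", err)
```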

In [35]:
# Working with CSV files
# Load the file contents into a Numpy array
# Usage: np.genfromtxt('file_name', delimiter=',', skip_header=1)
galaxy_properties = np.genfromtxt('galaxy.txt',delimiter= ',', skip_header= 1)

In [36]:
galaxy_properties

array([[25., 76., 99.],
       [39., 65., 70.],
       [59., 45., 77.],
       ...,
       [99., 62., 58.],
       [70., 71., 91.],
       [92., 39., 76.]])

In [37]:
galaxy_properties.shape

(10000, 3)
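
The file `galaxy.txt` is not reproduced here. As a sketch of what `np.genfromtxt` does, assuming a comma-separated file with one header line, the same call can be tried on an in-memory string (the column names and the three rows below are hypothetical):

```python
import io
import numpy as np

# Hypothetical file contents: one header line, then comma-separated numbers
text = """flux,luminosity,sigma
25,76,99
39,65,70
59,45,77
"""

# genfromtxt accepts a file-like object as well as a file name
data = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)

print(data)
print(data.shape)
print(data.dtype)   # values are read as floats by default
```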

In [46]:
weight = np.array([0.1, 0.02, 0.03])

In [47]:
result = galaxy_properties @ weight
result

array([ 6.99,  7.3 ,  9.11, ..., 12.88, 11.15, 12.26])

In [48]:
result.shape

(10000,)

In [49]:
(galaxy_properties @ weight).shape

(10000,)

In [50]:
# Add the result as a new column of the galaxy properties.
galaxy_result = np.concatenate((galaxy_properties,result.reshape(10000,1)),axis=1)
galaxy_result

array([[25.  , 76.  , 99.  ,  6.99],
       [39.  , 65.  , 70.  ,  7.3 ],
       [59.  , 45.  , 77.  ,  9.11],
       ...,
       [99.  , 62.  , 58.  , 12.88],
       [70.  , 71.  , 91.  , 11.15],
       [92.  , 39.  , 76.  , 12.26]])

In [51]:
galaxy_properties.shape

(10000, 3)

In [52]:
galaxy_result.shape

(10000, 4)

In [53]:
# delimiter=',' keeps the data rows consistent with the comma-separated header
np.savetxt('galaxy_result.txt', galaxy_result, fmt='%.2f', delimiter=',', header='flux,luminosity,sigma,SMBH', comments='')
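
Note that `np.savetxt` separates values with spaces unless a `delimiter` is given. The following sketch writes a small made-up table with a comma delimiter to a temporary file and reads it back, to check that the round trip preserves the values (the file name and data are assumptions for illustration):

```python
import os
import tempfile
import numpy as np

# Small hypothetical table standing in for galaxy_result
table = np.array([[25.0, 76.0, 99.0, 6.99],
                  [39.0, 65.0, 70.0, 7.30]])

# Write to a temporary location so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), 'roundtrip.txt')
np.savetxt(path, table, fmt='%.2f', delimiter=',',
           header='flux,luminosity,sigma,SMBH', comments='')

# Read it back, skipping the header line
loaded = np.genfromtxt(path, delimiter=',', skip_header=1)
print(np.allclose(table, loaded))
```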

Numpy provides hundreds of functions for performing operations on arrays. Here are some commonly used functions:


* Mathematics: `np.sum`, `np.exp`, `np.round`, arithmetic operators
* Array manipulation: `np.reshape`, `np.stack`, `np.concatenate`, `np.split`
* Linear Algebra: `np.matmul`, `np.dot`, `np.transpose`, `np.linalg.eigvals`
* Statistics: `np.mean`, `np.median`, `np.std`, `np.max`
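
A few of these functions in action, as a quick sketch on a small made-up matrix (`m` is an illustrative name). Note that the eigenvalue routine lives in the `np.linalg` submodule:

```python
import numpy as np

m = np.array([[2.0, 0.0],
              [0.0, 3.0]])

total = np.sum(m)                 # sum of all elements
average = np.mean(m)              # mean of all elements
flat = np.reshape(m, (4,))        # same data as a 1-D array
eigenvalues = np.linalg.eigvals(m)

print(total, average)
print(flat)
print(eigenvalues)
```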

> **How to find the function you need?** The easiest way to find the right function for a specific operation or use-case is to do a web search. For instance, searching for "How to join numpy arrays" leads to [this tutorial on array concatenation](https://cmdlinetips.com/2018/04/how-to-concatenate-arrays-in-numpy/). 

You can find a full list of array functions here: https://numpy.org/doc/stable/reference/routines.html

## Arithmetic operations, broadcasting and comparison

Numpy arrays support arithmetic operators like `+`, `-`, `*`, etc. You can perform an arithmetic operation with a single number (also called scalar) or with another array of the same shape. Operators make it easy to write mathematical expressions with multi-dimensional arrays.

In [54]:
arr2 = np.array([[1, 2, 3, 4], 
                 [5, 6, 7, 8], 
                 [9, 1, 2, 3]])

In [55]:
arr3 = np.array([[11, 12, 13, 14], 
                 [15, 16, 17, 18], 
                 [19, 11, 12, 13]])

In [56]:
# Adding a scalar
arr2 + 3

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12,  4,  5,  6]])

In [58]:
# Element-wise subtraction
arr3 - arr2

array([[10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]])

In [59]:
# Division by scalar
arr2 / 2

array([[0.5, 1. , 1.5, 2. ],
       [2.5, 3. , 3.5, 4. ],
       [4.5, 0.5, 1. , 1.5]])

In [60]:
# Element-wise multiplication
arr2 * arr3

array([[ 11,  24,  39,  56],
       [ 75,  96, 119, 144],
       [171,  11,  24,  39]])

In [61]:
# Modulus with scalar
arr2 % 4

array([[1, 2, 3, 0],
       [1, 2, 3, 0],
       [1, 1, 2, 3]])
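
The examples above use a scalar or two arrays of the same shape. As a brief sketch of broadcasting and comparison, arrays of compatible but different shapes can also be combined, and comparison operators work element-wise (the names `row` and `mask` are illustrative):

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 1, 2, 3]])
row = np.array([10, 20, 30, 40])   # shape (4,)

# Broadcasting: the row is added to each of the three rows of arr2
broadcast_sum = arr2 + row
print(broadcast_sum)

# Comparison is element-wise and returns an array of booleans
mask = arr2 > 4
print(mask)

# Summing a boolean array counts the True entries
print(mask.sum())
```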

## Array indexing and slicing

Numpy extends Python's list indexing notation using `[]` to multiple dimensions in an intuitive fashion. You can provide a comma-separated list of indices or ranges to select a specific element or a subarray (also called a slice) from a Numpy array.

In [62]:
arr3 = np.array([
    [[11, 12, 13, 14], 
     [13, 14, 15, 19]], 
    
    [[15, 16, 17, 21], 
     [63, 92, 36, 18]], 
    
    [[98, 32, 81, 23],      
     [17, 18, 19.5, 43]]])

In [63]:
arr3.shape

(3, 2, 4)

In [65]:
arr3[1,1,3]

18.0

In [64]:
# Single element
arr3[1, 1, 2]

36.0

In [None]:
# General form: array_name[start:stop:step, start:stop:step, ...], with one index or range per dimension

In [69]:
# Subarray using ranges (a step of 1 on every axis selects the whole array)
arr3[::1, ::1, ::1]

array([[[11. , 12. , 13. , 14. ],
        [13. , 14. , 15. , 19. ]],

       [[15. , 16. , 17. , 21. ],
        [63. , 92. , 36. , 18. ]],

       [[98. , 32. , 81. , 23. ],
        [17. , 18. , 19.5, 43. ]]])

In [70]:
# Mixing indices and ranges
arr3[1:, 1, 3]

array([18., 43.])

In [71]:
# Mixing indices and ranges
arr3[1:, 1, :3]

array([[63. , 92. , 36. ],
       [17. , 18. , 19.5]])

In [72]:
# Using fewer indices
arr3[1]

array([[15., 16., 17., 21.],
       [63., 92., 36., 18.]])

In [73]:
# Using fewer indices
arr3[:2, 1]

array([[13., 14., 15., 19.],
       [63., 92., 36., 18.]])

In [74]:
# Using too many indices
arr3[1,3,2,1]

IndexError: too many indices for array

## Other ways of creating Numpy arrays

Numpy also provides some handy functions to create arrays of desired shapes with fixed or random values. Check out the [official documentation](https://numpy.org/doc/stable/reference/routines.array-creation.html) or use the `help` function to learn more.

In [75]:
# All zeros
np.zeros((3, 2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [76]:
# All ones
np.ones([2, 2, 3])

array([[[1., 1., 1.],
        [1., 1., 1.]],

       [[1., 1., 1.],
        [1., 1., 1.]]])

In [77]:
# Identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [78]:
# Random vector
np.random.rand(5)

array([0.34336648, 0.21753935, 0.0045433 , 0.38447411, 0.7664056 ])
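
A few more creation functions that often come up, shown as a short sketch. The seed value is an arbitrary choice that makes the random numbers repeatable:

```python
import numpy as np

evens = np.arange(0, 10, 2)        # start, stop (exclusive), step
grid = np.linspace(0, 1, 5)        # 5 evenly spaced values from 0 to 1 inclusive
sevens = np.full((2, 3), 7)        # 2x3 array filled with the value 7

# A seeded generator gives the same random numbers on every run
rng = np.random.default_rng(seed=42)
sample = rng.random(3)

print(evens)
print(grid)
print(sevens)
print(sample.shape)
```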

## Summary and Further Reading

With this, we complete our discussion of arrays in Python. We've covered the following topics in this tutorial:

- Going from Python lists to Numpy arrays
- Operating on Numpy arrays
- Benefits of using Numpy arrays over lists
- Multi-dimensional Numpy arrays
- Working with CSV data files
- Arithmetic operations
- Array indexing and slicing
- Other ways of creating Numpy arrays


Check out the following resources for learning more about Numpy:

- Official tutorial: https://numpy.org/devdocs/user/quickstart.html
- Numpy tutorial on W3Schools: https://www.w3schools.com/python/numpy_intro.asp
- Advanced Numpy (exploring the internals): http://scipy-lectures.org/advanced/advanced_numpy/index.html

You are ready to move on to the next tutorial: [Analyzing Tabular Data using Pandas](https://jovian.ai/aakashns/python-pandas-data-analysis).

## Reference:
* [A short course in Python for astronomers by Neal Jackson](http://www.jb.man.ac.uk/~njj/plan.pdf)
* An excellent course for beginners on the basics of Python: [Data analysis with python zero to pandas](https://jovian.ai/learn/data-analysis-with-python-zero-to-pandas)