# Agenda:

* Numpy Creation From Existing Data with examples
   * From List and Tuple
   * From Text Files.

# Numpy Creation From Existing Data:

* The NumPy library provides the np.array() and np.asarray() functions. Both take a list or tuple as input and create a NumPy array from it.

## From List and Tuple:
### np.array():
* Parameter: Object<br/>
* Returns: ndarray 


* Creates a NumPy array from the given object (list/tuple).

**Creating numpy using list**
* Create a list with some values and pass the list to the np.array() function.<br/>

In [4]:
import numpy as np
list_a = [2,3,4,4,5,1]
np.array(list_a)

array([2, 3, 4, 4, 5, 1])

**Creating numpy using tuple**
* Create a tuple with some values and pass the tuple to the np.array() function.<br/>

In [8]:
tuple_a = (2,3,4,4,5,1)
np.array(tuple_a)

array([2, 3, 4, 4, 5, 1])

* In the examples above, np.array() took a list or a tuple as input and converted it to an array.<br/>
* The next example shows how to pass dtype information to np.array().

**Passing dtype as parameter to numpy array**

In [8]:
np.array(list_a,dtype=np.float16)

array([ 2.,  3.,  4.,  4.,  5.,  1.], dtype=float16)

In [17]:
np.array(tuple_a,dtype=np.float16)

array([ 2.,  3.,  4.,  4.,  5.,  1.], dtype=float16)

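* If we don't specify the dtype, NumPy infers it from the data. For Python integers it uses the platform's default integer type (int32 or int64, depending on the platform and NumPy version).<br/>
* So far all examples have used numerical data. The next example uses string data.<br/>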
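
The dtype inference described above can be checked directly. This is a minimal sketch; the variable names (`ints`, `mixed`, `forced`) are illustrative and not part of the original notebook.

```python
import numpy as np

# No dtype given: NumPy infers one from the values.
ints = np.array([2, 3, 4])      # all Python ints -> platform default integer type
mixed = np.array([2, 3.5, 4])   # a single float promotes the whole array to float64

# An explicit dtype overrides the inference.
forced = np.array([2, 3, 4], dtype=np.float16)

print(ints.dtype)    # an integer type; its width (32 or 64 bits) depends on the platform
print(mixed.dtype)   # float64
print(forced.dtype)  # float16
```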

**Numpy array creation using list of string data**

In [12]:
np.array(['hello','world','friends'])

array(['hello', 'world', 'friends'],
      dtype='<U7')

**Numpy array creation using tuple of string data**

In [14]:
np.array(('hello','world','friends'))

array(['hello', 'world', 'friends'],
      dtype='<U7')

* In text arrays, the size of each element is determined by the longest string in the input.<br/>
* In the example above the longest word is 'friends', which has 7 characters, so np.array uses 7 as the maximum length of every string in the array (dtype '<U7').
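
One practical consequence of the fixed element width is that longer strings assigned later are silently truncated. The following sketch illustrates this under the assumption that a wider dtype such as 'U10' is acceptable for the data; the names `words` and `roomy` are illustrative.

```python
import numpy as np

words = np.array(['hello', 'world', 'friends'])
print(words.dtype.itemsize // 4)   # 7 characters per element (4 bytes per character)

# 'wonderful' has 9 characters, but each slot only holds 7.
words[0] = 'wonderful'
print(words[0])                    # wonderf

# Requesting a wider string dtype up front leaves room for longer values.
roomy = np.array(['hello', 'world', 'friends'], dtype='U10')
roomy[0] = 'wonderful'
print(roomy[0])                    # wonderful
```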

### np.asarray:
* Parameters: lists, lists of tuples, tuples, tuples of tuples, tuples of lists and ndarrays.
* Result: ndarray

* np.asarray() works like np.array(), but has fewer parameters.<br/>
* The main difference is that np.array() makes a copy of the object by default, while np.asarray() does not copy unless it is necessary.<br/>
* It can be understood easily from the example mentioned in this link:<a href="https://stackoverflow.com/questions/14415741/numpy-array-vs-asarray/41030256#41030256file">Numpy - array vs asarray</a>
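
The copy behaviour can also be seen in a few lines. This is a small sketch with illustrative names (`source`, `copied`, `aliased`, `converted`); it only applies when the input is already an ndarray, since a list or tuple always has to be converted into a new array.

```python
import numpy as np

source = np.array([1, 2, 3])

copied = np.array(source)      # np.array copies by default
aliased = np.asarray(source)   # input is already a suitable ndarray: no copy is made

source[0] = 99                 # modify the original in place

print(copied[0])               # 1   (the copy is unaffected)
print(aliased[0])              # 99  (same underlying object as source)
print(aliased is source)       # True

# A copy becomes necessary when the requested dtype differs from the input's dtype.
converted = np.asarray(source, dtype=float)
print(converted is source)     # False
```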

**Numpy creation using asarray from list object**

In [12]:
np.asarray(a=list_a,dtype=int)

array([2, 3, 4, 4, 5, 1])

**Numpy creation using asarray from tuple object**

In [9]:
np.asarray(a=tuple_a,dtype=int)

array([2, 3, 4, 4, 5, 1])

**Numpy creation using tuple of tuples**

In [11]:
tuples = ((1,4,6,7),(5,8,13,14),(0,1,5,1))
np.asarray(tuples)

array([[ 1,  4,  6,  7],
       [ 5,  8, 13, 14],
       [ 0,  1,  5,  1]])

* Numpy Library provides functions np.loadtxt(), np.genfromtxt() to read data from text files.

## From file:
### np.loadtxt()
* Parameters: file, str, pathlib.Path, list of str, generator
* Result: ndarray


* np.loadtxt() loads data from a text file.<br/>
* Note that each row in the text file must have the same number of values.
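
The row-length requirement can be tried without any file on disk. In this sketch an `io.StringIO` object stands in for the text file (an assumption made only to keep the example self-contained), and the flag `ragged_failed` is illustrative.

```python
from io import StringIO
import numpy as np

# Every row has three values, so loadtxt builds a 2x3 float array.
good = StringIO("1 2 3\n4 5 6\n")
table = np.loadtxt(good)
print(table.shape)   # (2, 3)
print(table.dtype)   # float64

# The second row is one value short, so loadtxt refuses the input.
ragged = StringIO("1 2 3\n4 5\n")
try:
    np.loadtxt(ragged)
    ragged_failed = False
except ValueError as err:
    ragged_failed = True
    print("loadtxt rejected the ragged input:", err)
```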

**Numpy creation from text file using loadtxt**<br/>
<img src="example11.PNG">

In [15]:
np.loadtxt('example1.txt')

array([ 1.,  2.,  3.,  4.,  5.,  6.])

* The default dtype of np.loadtxt is float, and the default delimiter is whitespace.<br/>
* The next example illustrates how to pass dtype information and how to separate the values in a text file using the delimiter argument.
* If the data contains multiple data types, we need to specify the dtype.
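
As a quick illustration of the dtype and delimiter arguments, the sketch below reads the same comma-separated text three ways. The in-memory text (`csv_text`) is a stand-in for a real file and is not one of the notebook's example files.

```python
from io import StringIO
import numpy as np

csv_text = "1,2,3\n4,5,6\n"

# Default dtype (float) with an explicit comma delimiter.
as_float = np.loadtxt(StringIO(csv_text), delimiter=',')

# Same data, read as integers.
as_int = np.loadtxt(StringIO(csv_text), delimiter=',', dtype=int)

# Same data, kept as text (up to 4 characters per field).
as_text = np.loadtxt(StringIO(csv_text), delimiter=',', dtype='U4')

print(as_float.dtype)  # float64
print(as_int[1, 2])    # 6
print(as_text[0, 0])   # 1  (a string, not a number)
```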

**Numpy creation from textfile which contains both string and numerical data using loadtxt**<br/>

<img src="example2.PNG">

In [20]:
np.loadtxt('example2.txt',dtype='U',delimiter=',')

array([['Name', ' Age', 'Sex '],
       ['John', ' 23', ' M  '],
       ['Ram', ' 30', ' M  '],
       ['Ali', '29', ' M  '],
       ['Seeta', ' 20', ' F '],
       ['julia', ' 18', ' F '],
       ['Shami', ' 35', ' F '],
       ['End', ' End', ' End ']],
      dtype='<U5')

* example2.txt contains both numerical and text data, so a Unicode string dtype is used.
* The examples below illustrate how to skip comments, specific rows and specific columns.<br/>

**Numpy creation using loadtxt method with comments and skiprows arguments**
<img src="exmaple3.PNG">

* The comments argument makes np.loadtxt skip the characters that appear after '//'.<br/>
* By default, np.loadtxt excludes the text after '#' if we don't provide a value for the comments argument.<br/>
* skiprows is useful for ignoring unimportant rows at the start of the text file.
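
The effect of comments and skiprows can be reproduced on a small in-memory sample. The text below is a made-up stand-in for a file (it is not the contents of example3.txt), and the name `rows` is illustrative.

```python
from io import StringIO
import numpy as np

text = (
    "Name,Age,Sex//header line\n"
    "Dhoni,29,M\n"
    "Virat,20,M//inline comment\n"
    "Shami,35,M\n"
)

# skiprows=1 drops the header line; comments='//' strips everything after '//'.
rows = np.loadtxt(StringIO(text), dtype='U16', delimiter=',',
                  comments='//', skiprows=1)

print(rows.shape)  # (3, 3)
print(rows[1])     # ['Virat' '20' 'M']
```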

In [140]:
np.loadtxt('example3.txt',dtype='U34',delimiter=',',comments='//',skiprows=3)

array([['Dhoni', '29', 'M  '],
       ['Virat', '20', 'F '],
       ['mithaali', '18', 'F '],
       ['Shami', '35', 'F '],
       ['End', 'End', 'End ']],
      dtype='<U34')

* To read only selected columns from the text file, use the usecols argument.
* The code below shows how to pass the column indices.
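
A minimal numeric sketch of usecols, using an in-memory stand-in for a file (`csv_text` and `subset` are illustrative names). Column indices start at 0.

```python
from io import StringIO
import numpy as np

csv_text = "10,20,30\n40,50,60\n"

# Keep only the first and third columns.
subset = np.loadtxt(StringIO(csv_text), delimiter=',', usecols=(0, 2))

print(subset)
# [[10. 30.]
#  [40. 60.]]
```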

**Usage of numpy usecols param**

In [142]:
np.loadtxt('example3.txt',dtype='U16',delimiter=',',comments='//',usecols=(1,2))

array([['Age', 'Sex '],
       ['23', 'M  '],
       ['30', 'M  '],
       ['29', 'M  '],
       ['20', 'F '],
       ['18', 'F '],
       ['35', 'F '],
       ['End', 'End ']],
      dtype='<U16')

### np.genfromtxt():
* Parameters: file, str, pathlib.Path, list of str, generator
* Result: ndarray

* Like np.loadtxt, np.genfromtxt loads data from text files, and also from archives (currently it recognizes only gzip and bzip2).<br/>
* np.genfromtxt does its work in two main loops. The first loop converts each line of the text file to a sequence of strings. The second loop converts each string to the appropriate data type.<br/>
* Because it uses two loops, it is slower than a single-loop method.<br/>
* In exchange, it is flexible enough to handle missing values.<br/>
* Note: np.loadtxt can't handle missing values.
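
The difference in missing-value handling can be seen on a tiny in-memory sample. This is a sketch, with `io.StringIO` standing in for a file and `loadtxt_failed` used only as an illustrative flag.

```python
from io import StringIO
import numpy as np

csv_text = "1,,3\n4,5,6\n"   # the second field of the first row is empty

# loadtxt cannot convert the empty field to a float and raises an error.
try:
    np.loadtxt(StringIO(csv_text), delimiter=',')
    loadtxt_failed = False
except ValueError as err:
    loadtxt_failed = True
    print("loadtxt failed:", err)

# genfromtxt reads the same text and marks the empty field as nan.
parsed = np.genfromtxt(StringIO(csv_text), delimiter=',')
print(parsed)
# [[ 1. nan  3.]
#  [ 4.  5.  6.]]
```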

**Numpy creation using genfromtxt method**

In [151]:
np.genfromtxt(fname='example1.txt')

array([ 1.,  2.,  3.,  4.,  5.,  6.])

**In np.genfromtxt, if dtype is None, the dtypes are determined by the contents of the data itself**

In [27]:
np.genfromtxt(fname='example2.txt',dtype=None,delimiter=',')

array([[b'Name', b' Age', b'Sex'],
       [b'John', b' 23', b' M'],
       [b'Ram', b' 30', b' M'],
       [b'Ali', b'29', b' M'],
       [b'Seeta', b' 20', b' F'],
       [b'julia', b' 18', b' F'],
       [b'Shami', b' 35', b' F'],
       [b'End', b' End', b' End']],
      dtype='|S5')
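
In the output above every column is read as a string, because the header and footer rows contain text. When the columns really do have different types, dtype=None gives each column its own type and returns a structured array. The sketch below shows this on a made-up in-memory sample; passing encoding='utf-8' is an assumption made here so that text columns come back as str rather than bytes.

```python
from io import StringIO
import numpy as np

csv_text = "1,2.5,abc\n2,3.5,def\n"

# dtype=None asks genfromtxt to guess a type for each column separately.
records = np.genfromtxt(StringIO(csv_text), delimiter=',',
                        dtype=None, encoding='utf-8')

print(records.dtype.names)   # ('f0', 'f1', 'f2')  default field names
print(records['f0'])         # [1 2]
print(records['f1'])         # [2.5 3.5]
print(records['f2'])         # ['abc' 'def']
```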

**Numpy creation from textfile which contains both string and numerical data using genfromtxt**

In [24]:
np.genfromtxt(fname='example2.txt',dtype='U16',delimiter=',')

array([['Name', ' Age', 'Sex'],
       ['John', ' 23', ' M'],
       ['Ram', ' 30', ' M'],
       ['Ali', '29', ' M'],
       ['Seeta', ' 20', ' F'],
       ['julia', ' 18', ' F'],
       ['Shami', ' 35', ' F'],
       ['End', ' End', ' End']],
      dtype='<U16')

**Example to understand comments,skip_header,skip_footer of genfromtxt**

In [36]:
np.genfromtxt(fname='example3.txt',dtype='U34',delimiter=',')

array([['Name', 'Age', 'Sex //this is header'],
       ['Hpandey', '23', 'M'],
       ['KPandey', '30', 'M'],
       ['Dhoni', '29', 'M'],
       ['Virat', '20', 'F'],
       ['mithaali', '18', 'F'],
       ['Shami', '35', 'F'],
       ['End', 'End', 'End // this is footer']],
      dtype='<U34')

* Notice that the code above did not skip the comments while creating the array, because the default comment marker is '#' and this file uses '//'.
* To remove the comments, header and footer, use the comments, skip_header and skip_footer arguments respectively.

In [146]:
np.genfromtxt(fname='example3.txt',dtype='U34',delimiter=',',skip_header=1,skip_footer=1,comments='//')

array([['Hpandey', '23', 'M'],
       ['KPandey', '30', 'M'],
       ['Dhoni', '29', 'M'],
       ['Virat', '20', 'F'],
       ['mithaali', '18', 'F'],
       ['Shami', '35', 'F']],
      dtype='<U34')

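* In the example above, genfromtxt skipped the header row (Name,Age,Sex) and the footer row (End,End,End) because of skip_header and skip_footer. Both arguments are line counts: the number of lines to skip at the start and at the end of the file.
* Notice in the output above that the comments were removed as well.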
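
The same three arguments can be tried on an in-memory sample. The text below is a made-up stand-in (not the contents of example3.txt), and `body` is an illustrative name.

```python
from io import StringIO
import numpy as np

text = (
    "Name,Age,Sex//this is header\n"
    "Dhoni,29,M\n"
    "Virat,20,M//inline note\n"
    "End,End,End//this is footer\n"
)

# Skip one line at the top, one at the bottom, and strip '//' comments.
body = np.genfromtxt(StringIO(text), dtype='U16', delimiter=',',
                     comments='//', skip_header=1, skip_footer=1)

print(body)
# [['Dhoni' '29' 'M']
#  ['Virat' '20' 'M']]
```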

**Genfromtxt missing values handling**

* As mentioned earlier, genfromtxt is capable of handling missing values.<br/>

<img src="exmaple4.PNG">

**Creating numpy array from textfile which has missing values**

In [178]:
np.genfromtxt(fname='example4.txt',delimiter=',')

array([[  1.,   2.,   3.,  nan,   5.],
       [  6.,   7.,  nan,   9.,  10.],
       [ 11.,  12.,  nan,  14.,  15.]])

* genfromtxt has a filling_values argument, which sets a default value for the missing entries.
* The example below replaces missing values with 0.
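
A self-contained version of the same idea, using an in-memory stand-in for the file. The sample text and the names `with_nan` and `with_zero` are illustrative.

```python
from io import StringIO
import numpy as np

csv_text = "1,2,,4\n5,,7,8\n"   # one empty field in each row

# Without filling_values, empty fields become nan.
with_nan = np.genfromtxt(StringIO(csv_text), delimiter=',')

# With filling_values=0, empty fields become 0 instead.
with_zero = np.genfromtxt(StringIO(csv_text), delimiter=',', filling_values=0)

print(with_zero)
# [[1. 2. 0. 4.]
#  [5. 0. 7. 8.]]
```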

**Replacing missing values with default value**

In [181]:
default_data = '0'
np.genfromtxt(fname='example4.txt',delimiter=',',filling_values=default_data)

array([[  1.,   2.,   3.,   0.,   5.],
       [  6.,   7.,   0.,   9.,  10.],
       [ 11.,  12.,   0.,  14.,  15.]])

**Numpy creation from byte stream**

* This example shows how to create a byte stream from a given string and read it with np.genfromtxt.

In [39]:
from io import BytesIO
data = BytesIO("2,3,1\n4,1,1".encode('utf8'))
np.genfromtxt(data,delimiter=',')

array([[ 2.,  3.,  1.],
       [ 4.,  1.,  1.]])

### Conclusion:
In this chapter we have seen NumPy array creation using the np.array, np.asarray, np.loadtxt and np.genfromtxt functions.<br/>
In the next chapter we will learn techniques for accessing elements of NumPy arrays.