# NumPy Array Creation From Existing Data:

In the previous chapter we saw how to initialize NumPy arrays using various techniques. In this chapter we will see how to create NumPy arrays from existing data.<br/>
Let's begin by creating arrays from list and tuple data.<br/>
The NumPy library has the np.array() and np.asarray() functions. Both take a list or tuple as input and create a NumPy array from it.
### From List and Tuple:
#### np.array():
* Parameter: Object<br/>
* Returns: ndarray 

First, import the NumPy library as np using the import statement below.

In [3]:
import numpy as np

Then create a list with some values and pass the list to the np.array() function.<br/>
See the code below:

**Example 1: creating numpy using list**

In [5]:
list_a = [2,3,4,4,5,1]
np.array(list_a)

array([2, 3, 4, 4, 5, 1])

In the above example we created a NumPy array from list data. In the next example we will create a NumPy array from tuple data.

**Example 2: creating numpy using tuple**

In [8]:
tuple_a = (2,3,4,4,5,1)
np.array(tuple_a)

array([2, 3, 4, 4, 5, 1])

In the above examples np.array() took list or tuple data as input and converted it to an array.<br/>
In the next example we will see how to pass dtype information to np.array().

**Example 3: passing dtype as parameter to numpy array**

In [8]:
np.array(list_a,dtype=np.float16)

array([ 2.,  3.,  4.,  4.,  5.,  1.], dtype=float16)

In [17]:
np.array(tuple_a,dtype=np.float16)

array([ 2.,  3.,  4.,  4.,  5.,  1.], dtype=float16)

If we don't specify the dtype, np.array() infers it from the data.<br/>
In Example 1 and Example 2 we didn't specify the dtype, so np.array() used the platform's default integer type. That is int64 on most 64-bit Linux and macOS systems, and int32 on Windows with NumPy versions before 2.0.<br/>
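Because the inferred integer width depends on the platform and NumPy version, it can help to inspect the dtype rather than assume it. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

# No dtype given: NumPy infers it from the values.
ints = np.array([2, 3, 4])        # all integers -> platform default integer type
floats = np.array([2, 3.5, 4])    # one float is enough to make the array float64

print(ints.dtype)     # int64 or int32, depending on platform and NumPy version
print(floats.dtype)   # float64

# Passing dtype explicitly removes the platform dependence.
fixed = np.array([2, 3, 4], dtype=np.int32)
print(fixed.dtype)    # int32
```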

So far all of our examples have used numerical data. In the next example we will use string data.<br/>
See the code below:

**Example 4: numpy array creation using list of string data**

In [12]:
np.array(['hello','world','friends'])

array(['hello', 'world', 'friends'],
      dtype='<U7')

In [14]:
np.array(('hello','world','friends'))

array(['hello', 'world', 'friends'],
      dtype='<U7')

In the above string array examples, the size of each element is determined by the length of the longest word.<br/>
Let's see what this means using the previous example. The longest word is 'friends' and its length (number of characters) is 7.<br/> np.array uses this length as the limit for each string in the array, which is why the dtype is '<U7' (Unicode strings of at most 7 characters).
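One practical consequence of this fixed width is that a longer string assigned later is silently truncated. The sketch below illustrates this; requesting a wider dtype such as 'U20' up front is one possible way to avoid it (the width 20 is an arbitrary choice for illustration).

```python
import numpy as np

words = np.array(['hello', 'world', 'friends'])
print(words.dtype.kind)      # U (Unicode string)
print(words.dtype.itemsize)  # 28 bytes: 7 characters * 4 bytes each

# Assigning a longer string keeps only the first 7 characters.
words[0] = 'extraordinary'
print(words[0])              # extraor

# Asking for a wider string dtype leaves room for longer values.
wide = np.array(['hello', 'world', 'friends'], dtype='U20')
wide[0] = 'extraordinary'
print(wide[0])               # extraordinary
```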

Now we understand how to create a NumPy array from a list/tuple using np.array.<br/>
Next we will use np.asarray to create a NumPy array from a list/tuple.

np.asarray() works similarly to np.array().<br/>
The main difference is that np.array makes a copy of the object (by default), while np.asarray does not unless necessary.<br/>
This is easy to understand from the example at this link:<br/><div style = "direction:rtl"><a href="https://stackoverflow.com/questions/14415741/numpy-array-vs-asarray/41030256#41030256file">Numpy - array vs asarray</a></div>
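The copy behaviour can also be checked directly. The following is a small sketch with illustrative variable names: when the input is already an ndarray of the requested dtype, np.asarray returns the same object, while np.array creates an independent copy.

```python
import numpy as np

original = np.array([1, 2, 3])
copied = np.array(original)     # new array with its own data
aliased = np.asarray(original)  # input is already a suitable ndarray: no copy

original[0] = 99
print(copied[0])                # 1  (the copy is unaffected)
print(aliased[0])               # 99 (same underlying data)
print(aliased is original)      # True

# A copy becomes necessary when a conversion is requested.
converted = np.asarray(original, dtype=float)
print(np.shares_memory(converted, original))  # False
```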

#### np.asarray:
* Parameters: lists, lists of tuples, tuples, tuples of tuples, tuples of lists and ndarrays.
* Result: ndarray

As in the previous examples, the example below passes list and tuple data to np.asarray.

**Example 5: numpy creation using asarray**

In [28]:
print('using list:\n',np.asarray(a=list_a,dtype=int))
print('using tuple:\n',np.asarray(a=tuple_a,dtype=int))

using list:
 [2 3 4 4 5 1]
using tuple:
 [2 3 4 4 5 1]


Let's pass slightly more complex data (a tuple of tuples) to np.asarray.

**Example 6: numpy creation using tuple of tuple data**

In [57]:
tuple_tuple = ((1,4,6,7),(5,8,13,14),(0,1,5,1))
np.asarray(tuple_tuple)

array([[ 1,  4,  6,  7],
       [ 5,  8, 13, 14],
       [ 0,  1,  5,  1]])

As the result above shows, it created an array from the tuple of tuples, treating each inner tuple as a row.
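To confirm the row interpretation, one can look at the shape of the result and index into it. This is a brief sketch using the same data as Example 6; the variable names are illustrative.

```python
import numpy as np

rows = ((1, 4, 6, 7), (5, 8, 13, 14), (0, 1, 5, 1))
grid = np.asarray(rows)

print(grid.shape)   # (3, 4): three inner tuples, four values each
second_row = grid[1]        # values of the second inner tuple
first_column = grid[:, 0]   # first value of every inner tuple
print(second_row.tolist())    # [5, 8, 13, 14]
print(first_column.tolist())  # [1, 5, 0]
```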

I hope at this point you have an idea of how to create a NumPy array from list and tuple data.

Next we will see how to create NumPy arrays from text files.<br/>
The NumPy library provides the functions np.loadtxt() and np.genfromtxt() to read data from text files.

### From file:
#### np.loadtxt()
* Parameters: file, str, pathlib.Path, list of str, generator
* Result: ndarray

np.loadtxt() loads data from a text file.<br/>
Note that each row in the text file must have the same number of values.

The example below takes a file name as input and creates a NumPy array from the data in that file.

**Example 7: numpy creation from text file using loadtxt**

In [116]:
np.loadtxt('example1.txt')

array([ 1.,  2.,  3.,  4.,  5.,  6.])

Before moving to the next example, note that the default dtype of np.loadtxt is float.<br/>
Since my data file (example1.txt) contains only float values, I didn't provide the dtype.<br/>
If your data file contains other data types, pass an appropriate dtype through the dtype argument.
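The effect of the dtype argument can be tried without a file on disk. The sketch below assumes a reasonably recent NumPy and uses io.StringIO as a stand-in for a text file; the sample text is made up for illustration.

```python
import numpy as np
from io import StringIO

text = "1 2 3\n4 5 6"   # hypothetical file content, whitespace separated

as_float = np.loadtxt(StringIO(text))           # default dtype is float
as_int = np.loadtxt(StringIO(text), dtype=int)  # request integers instead

print(as_float.dtype)   # float64
print(as_int.tolist())  # [[1, 2, 3], [4, 5, 6]]
```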

The next example illustrates how to pass the dtype and how to separate the values in a text file using the delimiter argument.

**Example 8: numpy creation from a text file which contains both string and numerical data using loadtxt**

In [137]:
np.loadtxt('example2.txt',dtype='U',delimiter=',')

array([['Name', ' Age', 'Sex '],
       ['John', ' 23', ' M  '],
       ['Ram', ' 30', ' M  '],
       ['Ali', '29', ' M  '],
       ['Seeta', ' 20', ' F '],
       ['julia', ' 18', ' F '],
       ['Shami', ' 35', ' F '],
       ['End', ' End', ' End ']],
      dtype='<U5')

In the above example, example2.txt contains both numerical and text data, so I chose a Unicode string dtype.

The examples below show how to skip comments and how to select specific rows and columns.<br/>
The output of the first cell below shows the content of my text file example3.txt.<br/>

**Example 9: numpy creation using loadtxt method with comments and skiprows arguments**

In [138]:
np.loadtxt('example3.txt',dtype='U34',delimiter=',')

array([['Name', 'Age', 'Sex //this is header'],
       ['Hpandey', '23', 'M  '],
       ['KPandey', '30', 'M  '],
       ['Dhoni', '29', 'M  '],
       ['Virat', '20', 'F '],
       ['mithaali', '18', 'F '],
       ['Shami', '35', 'F '],
       ['End', 'End', 'End // this is footer']],
      dtype='<U34')

In the example below I have used the comments argument to skip the characters that appear after '//'.<br/>
By default, np.loadtxt ignores text after '#' if we don't pass a value for the comments argument.<br/>
The skiprows argument is useful for ignoring unimportant rows at the start of the text file.

In [140]:
np.loadtxt('example3.txt',dtype='U34',delimiter=',',comments='//',skiprows=3)

array([['Dhoni', '29', 'M  '],
       ['Virat', '20', 'F '],
       ['mithaali', '18', 'F '],
       ['Shami', '35', 'F '],
       ['End', 'End', 'End ']],
      dtype='<U34')
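The same two arguments can be tried on an in-memory file. This is a minimal sketch assuming a recent NumPy; the sample text and names are invented for illustration and are not the contents of example3.txt.

```python
import numpy as np
from io import StringIO

# Hypothetical data: a header line and one inline comment, both marked with '//'.
text = "name,score//header\nann,1\nbob,2//late entry\ncid,3"

data = np.loadtxt(
    StringIO(text),
    dtype='U8',
    delimiter=',',
    comments='//',   # drop everything from '//' to the end of each line
    skiprows=1,      # skip the first line (the header)
)
print(data.tolist())  # [['ann', '1'], ['bob', '2'], ['cid', '3']]
```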

To read only the important columns from a text file, use the usecols argument. The code below shows how to pass the column indices.

**Example 10: usage of numpy usecols param**

In [142]:
np.loadtxt('example3.txt',dtype='U16',delimiter=',',comments='//',usecols=(1,2))

array([['Age', 'Sex '],
       ['23', 'M  '],
       ['30', 'M  '],
       ['29', 'M  '],
       ['20', 'F '],
       ['18', 'F '],
       ['35', 'F '],
       ['End', 'End ']],
      dtype='<U16')
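Column indices passed to usecols are zero-based. The sketch below, again using io.StringIO in place of a real file, selects two columns with a tuple and a single column with a plain integer (the latter assumes NumPy 1.11 or newer).

```python
import numpy as np
from io import StringIO

text = "1,2,3\n4,5,6"   # hypothetical file content

# Keep the first and third columns (indices 0 and 2).
subset = np.loadtxt(StringIO(text), delimiter=',', usecols=(0, 2))
print(subset.tolist())  # [[1.0, 3.0], [4.0, 6.0]]

# A single integer selects one column and yields a 1-D array.
single = np.loadtxt(StringIO(text), delimiter=',', usecols=1)
print(single.tolist())  # [2.0, 5.0]
```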

Like np.loadtxt, np.genfromtxt also loads data from text files and archives (currently, it recognizes only gzip and bzip2).<br/>
np.genfromtxt completes its task in two main loops. The first loop converts each line of the text file to a sequence of strings.<br/>
The second loop converts each string to the appropriate data type.<br/> Because it uses two loops to complete the task, it is relatively slower than a single-loop method.<br/>
It is, however, a flexible method that can handle missing values.<br/>
Note: np.loadtxt cannot handle missing values.

#### np.genfromtxt():
* Parameters: file, str, pathlib.Path, list of str, generator
* Result: ndarray

**Example 11: numpy creation using genfromtxt method**

In [151]:
np.genfromtxt(fname='example1.txt')

array([ 1.,  2.,  3.,  4.,  5.,  6.])

In np.genfromtxt, if dtype is None, the dtype of each column is determined individually from its contents.
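A short sketch of per-column type detection follows. It assumes NumPy 1.14 or newer (for the encoding argument) and uses made-up data. With mixed column types the result is a 1-D structured array whose fields get the default names f0, f1, f2.

```python
import numpy as np
from io import StringIO

text = "1,2.5,abc\n2,3.5,def"   # hypothetical data: int, float and string columns

table = np.genfromtxt(StringIO(text), delimiter=',', dtype=None, encoding='utf-8')

print(table.dtype.names)     # ('f0', 'f1', 'f2')
print(table['f0'].tolist())  # [1, 2]
print(table['f1'].tolist())  # [2.5, 3.5]
print(table.shape)           # (2,): one record per line
```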

**Example 12: numpy creation from textfile which contains both string and numerical data using genfromtxt**

This example is similar to Example 8, which we saw for np.loadtxt().

In [154]:
np.genfromtxt(fname='example2.txt',dtype='U16',delimiter=',')

array([['Name', ' Age', 'Sex'],
       ['John', ' 23', ' M'],
       ['Ram', ' 30', ' M'],
       ['Ali', '29', ' M'],
       ['Seeta', ' 20', ' F'],
       ['julia', ' 18', ' F'],
       ['Shami', ' 35', ' F'],
       ['End', ' End', ' End']],
      dtype='<U16')

**Example 13: example to understand comments, skip_header, skip_footer of genfromtxt**

This example is similar to Example 9, which we saw for np.loadtxt().

In [105]:
np.genfromtxt(fname='example3.txt',dtype='U34',delimiter=',')

array([['Name', 'Age', 'Sex //this is header'],
       ['Hpandey', '23', 'M'],
       ['KPandey', '30', 'M'],
       ['Dhoni', '29', 'M'],
       ['Virat', '20', 'F'],
       ['mithaali', '18', 'F'],
       ['Shami', '35', 'F'],
       ['End', 'End', 'End // this is footer']],
      dtype='<U34')

In [146]:
np.genfromtxt(fname='example3.txt',dtype='U34',delimiter=',',skip_header=1,skip_footer=1,comments='//')

array([['Hpandey', '23', 'M'],
       ['KPandey', '30', 'M'],
       ['Dhoni', '29', 'M'],
       ['Virat', '20', 'F'],
       ['mithaali', '18', 'F'],
       ['Shami', '35', 'F']],
      dtype='<U34')

In the above example, genfromtxt skipped the header row (Name,Age,Sex) and the footer row (End,End,End) using the skip_header and skip_footer arguments respectively. Both arguments take the number of lines to skip.
Notice that the comments were removed from the output as well.
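Since skip_header and skip_footer are line counts, more than one line can be dropped at either end. The following sketch uses invented data with two leading lines and one trailing line to skip; it assumes NumPy 1.14 or newer for the encoding argument.

```python
import numpy as np
from io import StringIO

# Hypothetical content: a title line, a header line, two records, a footer line.
text = "title line\nName,Age\nann,23\nbob,30\nEnd,End"

body = np.genfromtxt(
    StringIO(text),
    delimiter=',',
    dtype='U8',
    encoding='utf-8',
    skip_header=2,   # drop the title and header lines
    skip_footer=1,   # drop the last line
)
print(body.tolist())  # [['ann', '23'], ['bob', '30']]
```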

**Example 14: genfromtxt missing values handling**

When introducing np.genfromtxt I mentioned that it is capable of handling missing values.<br/>
In the output of the example below, the missing values have been replaced with nan.

In [178]:
np.genfromtxt(fname='example4.txt',delimiter=',')

array([[  1.,   2.,   3.,  nan,   5.],
       [  6.,   7.,  nan,   9.,  10.],
       [ 11.,  12.,  nan,  14.,  15.]])

As mentioned, np.genfromtxt is flexible in dealing with missing data. With its filling_values argument we can set a default value for missing values.

The lines below show how it assigns 0 in place of nan.

In [181]:
default_data = '0'
np.genfromtxt(fname='example4.txt',delimiter=',',filling_values=default_data)

array([[  1.,   2.,   3.,   0.,   5.],
       [  6.,   7.,   0.,   9.,  10.],
       [ 11.,  12.,   0.,  14.,  15.]])
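The behaviour can be reproduced without example4.txt. This is a small sketch on made-up data in which empty fields between delimiters stand for the missing values.

```python
import numpy as np
from io import StringIO

text = "1,2,,4\n5,,7,8"   # hypothetical data with two empty fields

raw = np.genfromtxt(StringIO(text), delimiter=',')
print(np.isnan(raw).sum())   # 2 missing values became nan

filled = np.genfromtxt(StringIO(text), delimiter=',', filling_values=0)
print(filled.tolist())       # [[1.0, 2.0, 0.0, 4.0], [5.0, 0.0, 7.0, 8.0]]
```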

**Example 15: numpy creation from byte stream**

The example below shows how to create a byte stream from a given string and read it with np.genfromtxt.

In [189]:
from io import BytesIO
data = BytesIO("1,1,1\n1,1,1".encode('utf8'))
np.genfromtxt(data,delimiter=',')

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

Here, we first imported BytesIO from io.<br/>
Then we created a byte stream using the BytesIO function.<br/>
Finally, we passed the byte stream to np.genfromtxt.<br/>
Note that under Python 3, older NumPy versions required generators to return byte strings; recent versions also accept ordinary strings.
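The parameter list for np.genfromtxt also mentions lists of strings and generators. The sketch below assumes a recent NumPy (1.14 or newer), where ordinary Python strings are accepted; the helper generator `rows` is hypothetical and exists only for this illustration.

```python
import numpy as np

# A list of strings: each item is treated as one line of the "file".
lines = ["1,1,1", "2,2,2"]
from_list = np.genfromtxt(lines, delimiter=',')
print(from_list.tolist())   # [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]

def rows():
    """Hypothetical generator yielding one comma-separated line at a time."""
    for value in (1, 2):
        yield "{0},{0},{0}".format(value)

from_gen = np.genfromtxt(rows(), delimiter=',')
print(from_gen.tolist())    # [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
```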

### Summary:
In this chapter we have seen NumPy array creation using the np.array, np.asarray, np.loadtxt and np.genfromtxt functions.<br/>
In the next chapter we will learn about NumPy functions and methods.