**Example 2:** Read data from a simple text file containing space-separated numbers on multiple lines. Each line must contain the same number of values.

In [7]:

arr = np.genfromtxt(r"C:\Users\PMLS\Desktop\SEM-5\Into to DS\Intro to DS\NumPy\Testing")
print("data:\n", arr)
print("shape: ", arr.shape)

data:
 [[ 1.  2.]
 [ 3.  4.]
 [ 5.  6.]
 [ 7.  8.]
 [ 9. 10.]]
shape:  (5, 2)


In [8]:
# You can read the numbers as integers by passing the dtype argument
arr = np.genfromtxt(r"C:\Users\PMLS\Desktop\SEM-5\Into to DS\Intro to DS\NumPy\Testing", dtype=np.uint8)
print("data:\n", arr)
print("shape: ", arr.shape)

data:
 [[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
shape:  (5, 2)


**Example 3:** Read data from a CSV text file containing comma-separated numbers. By default, `genfromtxt()` treats any run of whitespace as the separator, so here we need to pass `,` as the `delimiter` argument.

In [10]:
arr = np.genfromtxt("datasets/icecreamsales_simple.csv", dtype=np.int16, delimiter=',')
print("data:\n", arr)
print(arr.shape)

FileNotFoundError: datasets/icecreamsales_simple.csv not found.

**Example 4:** By default, `genfromtxt()` assumes that the first line contains data, not column labels. If the first row of the file contains column labels, we need to skip it using the `skip_header` argument.

In [11]:
import numpy as np
arr = np.genfromtxt("datasets/icecreamsales_withheader.csv", dtype=np.int16, delimiter=',', skip_header=1)
print("data:\n", arr)
print(arr.shape)

**Example 5:** If the file has comments at the beginning, in between, or at the end of lines, `genfromtxt()` must know which character starts a comment so that it can ignore them. The default comment character is `#`; if the file uses a different one, pass it to the `comments` argument.

In [13]:
arr = np.genfromtxt("datasets/icecreamsales_withcomments.csv", dtype=np.int16, delimiter=',', comments='#')
print("data:\n", arr)
print(arr.shape)
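
Since the `datasets/` files used above may not be available on your machine, here is a minimal sketch that combines `delimiter`, `skip_header`, and `comments` on in-memory text instead. The file contents are made up for illustration, and `io.StringIO` stands in for a file on disk.

```python
import io
import numpy as np

# Hypothetical file contents (made-up numbers), held in memory instead of on disk
text = """temperature,sales
# readings taken at noon
30,120
35,180   # a hot day
40,240
"""

arr = np.genfromtxt(
    io.StringIO(text),   # genfromtxt accepts file-like objects as well as paths
    dtype=np.int16,
    delimiter=",",
    skip_header=1,       # skip the line with the column labels
    comments="#",        # '#' is also the default comment character
)
print("data:\n", arr)
print(arr.shape)
```

Both the full-line comment and the trailing comment are dropped before the values are parsed, so only the three data rows remain.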

In [14]:
arr1 = np.array([[1.5, 2.3, 3.7], [4.0, 5.2, 6.8],[7.1, 8.4, 9.3]])
np.savetxt('datasets/myarr.txt', arr1, fmt='%.2f')

In [16]:
arr2 = np.genfromtxt("datasets/myarr.txt")
arr2

**Example 2:** Create a NumPy array and then save it as a CSV file. Finally, verify by reading the file contents back into a NumPy array.

In [17]:
arr1 = np.array([[1.5, 2.3, 3.7], [4.0, 5.2, 6.8],[7.1, 8.4, 9.3]])
np.savetxt('datasets/myarr.csv', arr1, fmt='%.2f', delimiter=',')

In [19]:
arr2 = np.genfromtxt("datasets/myarr.csv", usecols=[0, 1], delimiter=',')
arr2
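
The verification step can also be done programmatically rather than by eye. The sketch below writes the same array to a temporary directory (an assumption, so that it does not depend on a `datasets/` folder existing), reads all three columns back, and compares the result with the original using `np.allclose`.

```python
import os
import tempfile
import numpy as np

arr1 = np.array([[1.5, 2.3, 3.7], [4.0, 5.2, 6.8], [7.1, 8.4, 9.3]])

# Use a throwaway directory so the example runs anywhere
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myarr.csv")
    np.savetxt(path, arr1, fmt="%.2f", delimiter=",")
    arr2 = np.genfromtxt(path, delimiter=",")

# The values survive the round trip because they have at most two decimals
same = np.allclose(arr1, arr2)
print(arr2.shape, same)
```

Note that `fmt='%.2f'` rounds to two decimal places on writing, so values with more decimals would not compare equal after the round trip.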

## 3.  Bonus # 1
Visit `https://gist.github.com/arifpucit` and get the URL of the `climate.csv` file from this public GitHub gist. The file contains 10,000 climate measurements (temperature, rainfall, and humidity) in the following format:

```
temperature,rainfall,humidity
25.00,76.00,99.00
39.00,65.00,70.00
59.00,45.00,77.00
84.00,63.00,38.00
66.00,50.00,52.00
41.00,94.00,77.00
91.00,57.00,96.00
49.00,96.00,99.00
67.00,20.00,28.00
...
```

Download the file, read its data, and compute the average of the temperature, rainfall, and humidity values.

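- The `urllib.request.urlretrieve(url, filename=None)` function copies a remote file to local disk. If `filename` is given, the file is saved at that path; otherwise it is saved to a temporary location. It returns a tuple of the local file name and the response headers.
- Let us download `climate.csv` from the GitHub gist mentioned above.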
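
To see how `urlretrieve` behaves without needing network access, the sketch below uses a local `file://` URL as a stand-in for the gist URL. The file names and contents are assumptions made for illustration only.

```python
import pathlib
import tempfile
import urllib.request

with tempfile.TemporaryDirectory() as tmpdir:
    tmp = pathlib.Path(tmpdir)

    # A small local file plays the role of the remote climate.csv
    source = tmp / "source.csv"
    source.write_text("temperature,rainfall,humidity\n25.00,76.00,99.00\n")

    # Pass the URL and the path where the copy should be saved
    destination = tmp / "climate.csv"
    saved_path, headers = urllib.request.urlretrieve(source.as_uri(), str(destination))
    downloaded_text = destination.read_text()

print(saved_path)
print(downloaded_text)
```

The first element of the returned tuple is the path that was passed as `filename`, which matches the tuple printed by the download cell in this notebook.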

>**`urllib.request.urlopen()` may raise a `URLError` saying `SSL: CERTIFICATE_VERIFY_FAILED`. To work around this error, set the `_create_default_https_context` attribute of `ssl` to `_create_unverified_context`.**

In [20]:
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
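
The assignment above changes the default for every HTTPS request made through `urllib` in the session. A narrower alternative, sketched here as an assumption rather than as the notebook's method, is to build an unverified context with the public `ssl` API and pass it only to the call that needs it. The helper names `make_unverified_context` and `fetch_bytes` are hypothetical, and `fetch_bytes` is not called here because it would need network access.

```python
import ssl
import urllib.request

def make_unverified_context():
    """Build an SSL context that skips certificate verification."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False        # must be disabled before relaxing verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx

def fetch_bytes(url, context):
    """Download a URL using the given SSL context (hypothetical helper)."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()

ctx = make_unverified_context()
print(ctx.verify_mode, ctx.check_hostname)
```

Either way, skipping verification means the identity of the server is not checked, so it is best limited to trusted, public data such as this gist.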

In [12]:
import urllib.request  # importing urllib alone does not guarantee that the request submodule is loaded
# Get the raw URL of the csv file named climate.csv from the GitHub gist
myurl = 'https://gist.githubusercontent.com/arifpucit/6e2d95002460db296506ec6f0cfb7008/raw/dae54a4e20d34e4b9622333fcccf04c441a250b7/climate.csv'

# Pass the URL string and the local path where the file should be saved
urllib.request.urlretrieve(myurl, r'C:\Users\PMLS\Desktop\SEM-5\Into to DS\Intro to DS\NumPy\Testing')


('C:\\Users\\PMLS\\Desktop\\SEM-5\\Into to DS\\Intro to DS\\NumPy\\Testing',
 <http.client.HTTPMessage at 0x167aa14a720>)

In [23]:
import numpy as np
climate_data = np.genfromtxt("datasets/climate.csv", delimiter=',', skip_header=1)
print("Climate Data:\n", climate_data)
print(climate_data.shape)

In [24]:
# Slice data of the temperature column
climate_data[:,0]

In [25]:
# Slice data of the rainfall column
climate_data[:,1]

In [26]:
# Slice data of the humidity column
climate_data[:,2]

In [27]:
# Calculate the Mean of every column
print("Mean Temperature = ", climate_data[:,0].mean())
print("Mean Rainfall = ", climate_data[:,1].mean())
print("Mean Humidity = ", climate_data[:,2].mean())
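
The three column means can also be computed in a single call by reducing along `axis=0`. The sketch below uses only the first four rows shown in the format listing above as a small stand-in for the full 10,000-row file.

```python
import numpy as np

# First four rows of climate.csv as shown above (temperature, rainfall, humidity)
sample = np.array([
    [25.0, 76.0, 99.0],
    [39.0, 65.0, 70.0],
    [59.0, 45.0, 77.0],
    [84.0, 63.0, 38.0],
])

# axis=0 collapses the rows, leaving one mean per column
column_means = sample.mean(axis=0)
print(column_means)   # [51.75 62.25 71.  ]
```

With the real data, `climate_data.mean(axis=0)` would return the same three values as the three separate `mean()` calls.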

>- Let us now create a fourth column: the weighted sum of each row's temperature, rainfall, and humidity, obtained by matrix multiplication of `climate_data` with a vector of hypothetical weights.

In [28]:
weights = np.array([0.3, 0.2, 0.5])
new_col = np.matmul(climate_data, weights)
new_col

In [29]:
new_col.shape
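
To make the matrix multiplication concrete, here is a small worked sketch on the first four rows of the file. Multiplying an `(n, 3)` array by a `(3,)` weight vector gives an `(n,)` array in which each entry is `0.3*temperature + 0.2*rainfall + 0.5*humidity` for that row.

```python
import numpy as np

# First four rows of climate.csv as shown earlier
sample = np.array([
    [25.0, 76.0, 99.0],
    [39.0, 65.0, 70.0],
    [59.0, 45.0, 77.0],
    [84.0, 63.0, 38.0],
])
weights = np.array([0.3, 0.2, 0.5])

# The @ operator is equivalent to np.matmul for these shapes
weighted = sample @ weights

# The same result written out as an element-wise product and a row sum
manual = (sample * weights).sum(axis=1)

print(weighted)
print(weighted.shape)   # (4,)
```

For the first row this is `25*0.3 + 76*0.2 + 99*0.5 = 72.2`, which matches the first value of `col4` in the results file.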

Let's add `new_col` to `climate_data` as a fourth column using `np.concatenate`. Since we wish to add a new column, i.e., concatenate horizontally, we pass the argument `axis=1` to `np.concatenate`. The `axis` argument specifies the dimension along which the arrays are joined.

In [30]:
# First reshape new_col from shape (10000,) to a 10000x1 column for concatenation;
# -1 lets NumPy infer the number of rows instead of hard-coding it
result_data = new_col.reshape(-1, 1)
result_data, result_data.shape

In [31]:
climate_results = np.concatenate((climate_data, result_data), axis=1)

In [32]:
climate_results

In [33]:
climate_results.shape
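
The reshape-then-concatenate steps can be checked on a small array. The sketch below again uses the first four rows of the file, and also shows `np.column_stack` as an alternative that accepts the 1-D column directly; this alternative is an addition for comparison, not the approach used in this notebook.

```python
import numpy as np

sample = np.array([
    [25.0, 76.0, 99.0],
    [39.0, 65.0, 70.0],
    [59.0, 45.0, 77.0],
    [84.0, 63.0, 38.0],
])
weights = np.array([0.3, 0.2, 0.5])
new_col = sample @ weights            # shape (4,)

# np.concatenate needs both inputs to be 2-D, so reshape to (4, 1) first
via_concatenate = np.concatenate((sample, new_col.reshape(-1, 1)), axis=1)

# np.column_stack turns 1-D inputs into columns automatically
via_column_stack = np.column_stack((sample, new_col))

print(via_concatenate.shape)   # (4, 4)
```

Both calls produce the same `(4, 4)` array, with the weighted sum as the last column.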

The results will be written back in CSV format to the file `climate_results.csv`, which should look like this:

```
temperature,rainfall,humidity,col4
25.00,76.00,99.00,72.20
39.00,65.00,70.00,59.70
59.00,45.00,77.00,65.20
84.00,63.00,38.00,56.80
...
```



>- Let's write the resulting NumPy array `climate_results` to a new file `climate_results.csv` using the `np.savetxt` method.

In [34]:
np.savetxt('datasets/climate_results.csv', 
           climate_results, 
           fmt='%.2f', 
           delimiter=',',
           header='temperature,rainfall,humidity,col4', 
           comments='')

In [35]:
! cat datasets/climate_results.csv
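
`cat` is a shell command and may not be available on every platform, so the written file can also be inspected from Python. The sketch below writes two rows to a temporary directory (an assumption, to keep it self-contained) and reads the lines back. Passing `comments=''` matters here: by default `np.savetxt` prefixes the header line with `# `.

```python
import os
import tempfile
import numpy as np

# Two result rows taken from the expected output shown earlier
results = np.array([
    [25.0, 76.0, 99.0, 72.2],
    [39.0, 65.0, 70.0, 59.7],
])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "climate_results.csv")
    np.savetxt(path, results, fmt="%.2f", delimiter=",",
               header="temperature,rainfall,humidity,col4",
               comments="")   # no '# ' prefix in front of the header
    with open(path) as f:
        lines = f.read().splitlines()

for line in lines:
    print(line)
```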

## 4.  Bonus # 2
Now let us read an image file from disk and load it into a NumPy array for image processing tasks.

In [36]:
from PIL import Image

In [37]:
rgb_img = Image.open("datasets/speech.jpg")

In [38]:
rgb_img.mode

In [39]:
rgb_img.size

When translating a color image to greyscale (mode "L"), the library uses the ITU-R 601-2 luma transform:
```
    L = R * 299/1000 + G * 587/1000 + B * 114/1000
```

In [40]:
grey_img = rgb_img.convert('L')

In [41]:
grey_img.mode

In [42]:
grey_img.size

In [43]:
rgb_img

In [44]:
grey_img

#### Let us convert the two images to NumPy arrays

In [45]:
rgb_img_array = np.array(rgb_img)

In [46]:
rgb_img_array.shape

In [47]:
rgb_img_array

In [48]:
grey_img_array = np.array(grey_img)

In [49]:
grey_img_array.shape

In [50]:
grey_img_array
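
One detail worth noting when comparing the outputs above: `Image.size` is reported as `(width, height)`, while the NumPy array has shape `(height, width, channels)` for an RGB image and `(height, width)` for a greyscale one. The sketch below illustrates this on a small image created in memory with `Image.new`, used here as a stand-in because `datasets/speech.jpg` may not be available.

```python
import numpy as np
from PIL import Image

# A small solid-red image created in memory: 4 pixels wide, 3 pixels high
width, height = 4, 3
rgb_demo = Image.new("RGB", (width, height), color=(255, 0, 0))
grey_demo = rgb_demo.convert("L")

rgb_arr = np.array(rgb_demo)
grey_arr = np.array(grey_demo)

print(rgb_demo.size, rgb_arr.shape, rgb_arr.dtype)     # (4, 3) (3, 4, 3) uint8
print(grey_demo.size, grey_arr.shape, grey_arr.dtype)  # (4, 3) (3, 4) uint8
```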