#### Task 1: Few More Operations on a NumPy Array

Performing mathematical operations on a NumPy array is easier than on a Python list, because arithmetic operators apply to every element at once.

Let's say you have a NumPy array with the radii of 10 circles and want to compute the area of every circle. You can use the double-asterisk (`**`) operator on the NumPy array to square the values, then multiply the result by `pi`.

**Note:** The area of a circle with radius $r$ is $\pi r^{2}$.

In [None]:
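import numpy as np
# Array of 10 radii (the fixed values 1 through 10).
radii = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Compute the area of every circle element-wise: pi * r^2.
area = np.pi * radii ** 2
# Reshape to a 10 x 1 column so that each area prints on its own line.
print(area.reshape(10, 1))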

[[  3.14159265]
 [ 12.56637061]
 [ 28.27433388]
 [ 50.26548246]
 [ 78.53981634]
 [113.09733553]
 [153.93804003]
 [201.06192983]
 [254.46900494]
 [314.15926536]]
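
For comparison, here is a minimal sketch of the same computation done once with a plain Python list and once with a NumPy array. The names `radii_list`, `areas_list`, `radii_arr`, and `areas_arr` are illustrative and not part of the task above.

```python
import math
import numpy as np

# Illustrative radii; these names are not from the task above.
radii_list = [1, 2, 3, 4, 5]

# With a Python list, each element has to be processed explicitly.
areas_list = [math.pi * r ** 2 for r in radii_list]

# With a NumPy array, the arithmetic applies to every element at once.
radii_arr = np.array(radii_list)
areas_arr = np.pi * radii_arr ** 2

# Both approaches produce the same areas.
print(np.allclose(areas_list, areas_arr))
```

The list version needs a loop or comprehension, while the array version reads like the formula itself.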


#### Task 2

The dataset contains the following variables:

1. `name`: The name of the place where a meteorite was found or observed.

2. `id`: A unique identifier for a meteorite.

3. `nametype`: One of the following:
    
    - `Valid`: A typical meteorite.
    
    - `Relict`: A meteorite that has been highly degraded by the weather on Earth.

4. `recclass`: The class of the meteorite; one of a large number of classes based on physical, chemical, and other characteristics. 

5. `mass`: The mass of the meteorite, in grams.

6. `fall`: Whether the meteorite was seen falling, or was discovered after its impact; one of the following:

    - `Fell`: The meteorite's fall was observed.
    
    - `Found`: The meteorite's fall was not observed.

7. `year`: The year the meteorite fell, or the year it was found (depending on the value of `fall`).

8. `reclat`: The latitude of the meteorite's landing.

9. `reclong`: The longitude of the meteorite's landing.

10. `GeoLocation`: A parentheses-enclosed, comma-separated tuple that combines `reclat` and `reclong`.
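
As a small illustration of the `GeoLocation` format, the sketch below parses one such string back into its two numbers. The sample string is the value printed for the `Havana` row in the slicing output further down; the variable names are illustrative.

```python
import ast

# Sample 'GeoLocation' string; the variable names here are illustrative.
geo_text = "(40.333330, -90.050000)"

# ast.literal_eval safely evaluates the text as a Python tuple literal.
lat, lon = ast.literal_eval(geo_text)

print(lat, lon)
```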

#### Loading the Dataset

Dataset link: https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv


In [None]:
#project time!
# Import the necessary libraries for this class and create a DataFrame.

#### The `describe()` Function

In [None]:
import pandas as pd
df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
df.describe()

                 id        mass         year     reclat     reclong
count       45716.0     45585.0      45428.0    38401.0     38401.0
mean   26889.735104    13278.08  1991.772189  -39.12258   61.074319
std     16860.68303    574988.9    27.181247  46.378511   80.647298
min             1.0         0.0        301.0  -87.36667  -165.43333
25%        12688.75         7.2       1987.0  -76.71424         0.0
50%         24261.5        32.6       1998.0      -71.5    35.66667
75%        40656.75       202.6       2003.0        0.0   157.16667
max         57458.0  60000000.0       2501.0   81.16667   354.47333


#### Slicing a DataFrame and the Ampersand (`&`) Logical Operator

**Syntax:** `data_frame[condition1 & condition2 & condition3 ... conditionN]`

Where `N` is the total number of conditions to be applied.
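
A minimal sketch of this syntax on a small, made-up DataFrame follows. The name `toy_df` and its values are hypothetical and serve only to show how `&` differs from the pipe (`|`) operator.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the syntax.
toy_df = pd.DataFrame({
    "year": [301, 1880, 1998, 2501],
    "mass": [5.0, 21.0, 720.0, 46.0],
})

# '&' keeps rows where BOTH conditions hold.
# Each condition needs its own parentheses because '&' binds
# more tightly than comparison operators such as '>=' and '<='.
both = toy_df[(toy_df["year"] >= 860) & (toy_df["year"] <= 2016)]

# '|' keeps rows where AT LEAST ONE condition holds.
either = toy_df[(toy_df["year"] < 860) | (toy_df["year"] > 2016)]

print(both)
print(either)
```

In this toy data, `both` holds the rows for 1880 and 1998, and `either` holds the rows for 301 and 2501.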

In [None]:
# Rows containing the year values less than 860 or greater than 2016.
import pandas as pd
df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
# The pipe (|) operator keeps the rows that satisfy at least one of the two conditions.
f_df = df[(df['year'] < 860) | (df['year'] > 2016)]
print(f_df)

                        name     id nametype           recclass   mass   fall  \
16356                 Havana  11857    Valid  Iron, IAB complex    NaN  Found   
30679  Northwest Africa 7701  57150    Valid                CK6   55.0  Found   
38188                     Ur  24125    Valid               Iron    NaN  Found   
38301        Wietrzno-Bobrka  24259    Valid               Iron  376.0  Found   

         year    reclat   reclong              GeoLocation  
16356   301.0  40.33333 -90.05000  (40.333330, -90.050000)  
30679  2101.0   0.00000   0.00000     (0.000000, 0.000000)  
38188  2501.0  30.90000  46.01667   (30.900000, 46.016670)  
38301   601.0  49.41667  21.70000   (49.416670, 21.700000)  


In [None]:
# Rows having the 'reclong' values greater than or equal to -180 degrees and less than or equal to 180 degrees.
import pandas as pd
df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
f = df[(df['reclong']>=-180)&(df['reclong']<=180)]
print(f)

             name     id nametype              recclass      mass   fall  \
0          Aachen      1    Valid                    L5      21.0   Fell   
1          Aarhus      2    Valid                    H6     720.0   Fell   
2            Abee      6    Valid                   EH4  107000.0   Fell   
3        Acapulco     10    Valid           Acapulcoite    1914.0   Fell   
4         Achiras    370    Valid                    L6     780.0   Fell   
...           ...    ...      ...                   ...       ...    ...   
45711  Zillah 002  31356    Valid               Eucrite     172.0  Found   
45712      Zinder  30409    Valid  Pallasite, ungrouped      46.0  Found   
45713        Zlin  30410    Valid                    H4       3.3  Found   
45714   Zubkovsky  31357    Valid                    L6    2167.0  Found   
45715  Zulu Queen  30414    Valid                  L3.7     200.0  Found   

         year    reclat    reclong               GeoLocation  
0      1880.0  50.77500 

In [None]:
# Check whether the DataFrame has missing values using the 'isnull()' function.
import pandas as pd
df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
df.isnull()

        name     id  nametype  recclass   mass   fall   year  reclat  reclong  GeoLocation
0      False  False     False     False  False  False  False   False    False        False
1      False  False     False     False  False  False  False   False    False        False
2      False  False     False     False  False  False  False   False    False        False
3      False  False     False     False  False  False  False   False    False        False
4      False  False     False     False  False  False  False   False    False        False
...      ...    ...       ...       ...    ...    ...    ...     ...      ...          ...
45711  False  False     False     False  False  False  False   False    False        False
45712  False  False     False     False  False  False  False   False    False        False
45713  False  False     False     False  False  False  False   False    False        False
45714  False  False     False     False  False  False  False   False    False        False


In [None]:
# Check whether the DataFrame has missing values using the 'isna()' function.
import pandas as pd
df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
df.isna()

        name     id  nametype  recclass   mass   fall   year  reclat  reclong  GeoLocation
0      False  False     False     False  False  False  False   False    False        False
1      False  False     False     False  False  False  False   False    False        False
2      False  False     False     False  False  False  False   False    False        False
3      False  False     False     False  False  False  False   False    False        False
4      False  False     False     False  False  False  False   False    False        False
...      ...    ...       ...       ...    ...    ...    ...     ...      ...          ...
45711  False  False     False     False  False  False  False   False    False        False
45712  False  False     False     False  False  False  False   False    False        False
45713  False  False     False     False  False  False  False   False    False        False
45714  False  False     False     False  False  False  False   False    False        False
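
The Boolean tables above show only the first and last few rows, so they cannot reveal whether any value is missing further inside the DataFrame. One common way to summarise them is to sum the `True` values per column. The sketch below does this on a small, made-up DataFrame; `toy_df` and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values.
toy_df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "mass": [21.0, np.nan, 720.0, np.nan],
    "year": [1880.0, 1951.0, np.nan, 1998.0],
})

# True counts as 1 when summed, so this gives the number of
# missing values in each column.
missing_per_column = toy_df.isnull().sum()
print(missing_per_column)

# A single True/False answer for the whole DataFrame.
has_missing = toy_df.isnull().values.any()
print(has_missing)
```

`isna()` can be used in place of `isnull()` here, since both return the same Boolean table.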


In [None]:
# Retrieve all the rows containing the missing 'mass' values in the 'correct_lat_long_df' DataFrame.
import pandas as pd
# Note: in this cell, 'correct_lat_long_df' is loaded directly from the CSV file.
correct_lat_long_df = pd.read_csv("https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/whitehat-ds-datasets/meteorite-landings/meteorite-landings.csv")
# 'isnull()' marks the missing 'mass' values as True; indexing with it keeps only those rows.
g = correct_lat_long_df[correct_lat_long_df['mass'].isnull()]
print(g)

                    name     id nametype      recclass  mass   fall    year  \
12       Aire-sur-la-Lys    425    Valid       Unknown   NaN   Fell  1769.0   
38                Angers   2301    Valid            L6   NaN   Fell  1822.0   
76     Barcelona (stone)   4944    Valid            OC   NaN   Fell  1704.0   
93              Belville   5009    Valid            OC   NaN   Fell  1937.0   
172    Castel Berardenga   5292    Valid    Stone-uncl   NaN   Fell  1791.0   
...                  ...    ...      ...           ...   ...    ...     ...   
38275     Wei-hui-fu (a)  24231    Valid          Iron   NaN  Found  1931.0   
38276     Wei-hui-fu (b)  24232    Valid          Iron   NaN  Found  1931.0   
38278            Weiyuan  24233    Valid  Mesosiderite   NaN  Found  1978.0   
41460      Yamato 792768  28117    Valid           CM2   NaN  Found  1979.0   
45698      Zapata County  30393    Valid          Iron   NaN  Found  1930.0   

         reclat    reclong               GeoLocatio

#### NumPy vs. Pandas vs. Python List

1. When you just want to store data, retrieve data, and add more data, use a Python list.

2. When you want to store numerical data (one-dimensional or multidimensional) and perform a lot of mathematical operations on it, use a NumPy array. It is faster than a Python list, and it makes multidimensional arrays easy to create.

3. When you want to import data from an external file such as TXT, XLSX, CSV, or XML, use Pandas. Pandas also allows you to interpret data in different ways and to perform complicated data extraction, manipulation, and processing operations on a dataset. Throughout this GDSC bootcamp, we will use the Pandas library to handle data.
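
The three cases above can be sketched side by side as follows. The sample names and masses are taken from the first rows of the dataset output shown earlier; the variable names are illustrative, and `io.StringIO` stands in for a CSV file on disk.

```python
import io
import numpy as np
import pandas as pd

# 1. Python list: store, retrieve, and append items.
names = ["Aachen", "Aarhus"]
names.append("Abee")

# 2. NumPy array: element-wise arithmetic on numerical data.
masses_g = np.array([21.0, 720.0, 107000.0])
masses_kg = masses_g / 1000

# 3. Pandas: read tabular data from a file-like source.
# io.StringIO stands in for a CSV file on disk here.
csv_text = "name,mass\nAachen,21.0\nAarhus,720.0\nAbee,107000.0\n"
toy_df = pd.read_csv(io.StringIO(csv_text))

print(names)
print(masses_kg)
print(toy_df)
```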