# 💡 **Question 7 -**
Using the data from Question 4, write code to analyze the data and answer the following questions.

**Note -**
1. Draw plots that illustrate the analysis for each of the following questions

2. Add code comments wherever they help in understanding the code

**Insights to be drawn -**

● Get all the Earth meteorites that fell before the year 2000

● Get the coordinates of all Earth meteorites that fell before the year 1970

● Assuming that the mass of the Earth meteorites is in kg, get all those whose mass is more than 10,000 kg

# **Code from Question 4 that produces the data**

In [3]:
import pandas as pd
import requests

def fetch_data(url):
    """
    Fetches data from the specified URL using the requests library.
    Returns the parsed JSON data if the response is successful (status code 200),
    otherwise raises an exception.
    """
    response = requests.get(url)
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception("Error downloading data from URL: {}".format(url))
    
def read_data(data):
    """
    Extracts specific attributes from the data and returns them as a list of dictionaries.
    """
    attributes = []

    for meteorite in data:
        attributes.append({
            'Name of Earth Meteorite': meteorite.get('name', ''),
            'id': meteorite.get('id', ''),
            'nametype': meteorite.get('nametype', ''),
            'recclass': meteorite.get('recclass', ''),
            'mass': meteorite.get('mass', ''),
            'year': meteorite.get('year', ''),
            'reclat': meteorite.get('reclat', ''),
            'reclong': meteorite.get('reclong', ''),
            'coordinates': meteorite.get('geolocation', {}).get('coordinates', [])
        })

    return attributes



def convert_to_csv(df, filename):
    # Convert the DataFrame to a CSV file with the specified filename.
    # Set index=False to exclude the index column in the resulting CSV file.
    df.to_csv(filename, index=False)

def main():
    # Define the URL to fetch the data from.
    url = "https://data.nasa.gov/resource/y77d-th95.json"
    
    # Download the data from the URL using the fetch_data function.
    data = fetch_data(url)

    # Read the relevant data attributes using the read_data function.
    attributes = read_data(data)
     
    # Convert the list of dictionaries to a DataFrame.
    df = pd.DataFrame(attributes)

    # Convert the DataFrame to a CSV file.
    convert_to_csv(df, "meteorite.csv")

if __name__ == "__main__":
    main()


Explanation:

The fetch_data function fetches the data from the provided URL using the requests library.

It checks if the response status code is 200, indicating a successful request.

If the response is successful, it parses the response content as JSON using response.json() and returns the parsed data.

If the response status code is not 200, it raises an exception with an error message indicating the issue.
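
The status check in fetch_data can also be delegated to the requests library. The following is a minimal sketch, not the code used in this notebook: `fetch_data_checked` is a hypothetical name, and the 30-second `timeout` is an arbitrary assumption. One difference in behavior: `raise_for_status()` raises only for 4xx and 5xx responses, whereas fetch_data rejects every status other than 200.

```python
import requests


def fetch_data_checked(url, timeout=30):
    """Hypothetical variant of fetch_data (name and timeout are assumptions)."""
    # A timeout keeps the call from waiting indefinitely on an unresponsive server.
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx status codes; does nothing otherwise.
    response.raise_for_status()
    # Parse the response body as JSON and return the resulting Python object.
    return response.json()
```

Called as `fetch_data_checked(url)`, it returns the same parsed JSON as fetch_data for a successful request.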

The read_data function iterates over the data and extracts the relevant attributes, storing them in a list of dictionaries.

The extracted attributes include information such as the name, ID, type, class, mass, year, latitude, longitude, and coordinates of the meteorites.
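
To make the role of the `.get()` defaults concrete, here is a small sketch of a single extraction step. `extract` is a hypothetical, trimmed-down version of one read_data loop iteration, and both records are hand-made for illustration rather than taken from the API response.

```python
def extract(meteorite):
    """Hypothetical, shortened version of one read_data loop iteration."""
    return {
        "name": meteorite.get("name", ""),
        "mass": meteorite.get("mass", ""),
        # Chained .get() calls: a missing 'geolocation' key falls back to {},
        # and a missing 'coordinates' key falls back to [].
        "coordinates": meteorite.get("geolocation", {}).get("coordinates", []),
    }


# Hand-made records (assumed shape, for illustration only).
complete = {
    "name": "Aachen",
    "mass": "21",
    "geolocation": {"type": "Point", "coordinates": [6.08333, 50.775]},
}
partial = {"name": "Example"}  # no mass, no geolocation

row_complete = extract(complete)
row_partial = extract(partial)

print(row_complete)
print(row_partial)
```

The record with missing keys does not raise a `KeyError`; it yields an empty string for the mass and an empty list for the coordinates.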

The convert_to_csv function takes a DataFrame (df) and a filename as input.

It uses the to_csv() method of the DataFrame to convert the data into a CSV file with the specified filename.

The index=False parameter is used to exclude the index column in the resulting CSV file.
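
The effect of `index=False` can be seen with a small round trip. This is a sketch on a two-row, hand-made DataFrame that writes to in-memory buffers instead of files; the variable names are illustrative.

```python
import io

import pandas as pd

# Two hand-made rows, for illustration only.
df = pd.DataFrame({"name": ["Aachen", "Aarhus"], "mass": [21.0, 720.0]})

# Default behavior: the index is written as an extra, unnamed first column.
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)

# index=False: only the data columns are written.
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)

cols_with = list(pd.read_csv(with_index).columns)
cols_without = list(pd.read_csv(without_index).columns)

print(cols_with)     # ['Unnamed: 0', 'name', 'mass']
print(cols_without)  # ['name', 'mass']
```

When the index is written, reading the file back produces an extra `Unnamed: 0` column, which is what `index=False` avoids.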

The main function is the entry point of the program.

It defines the URL to fetch the data from.

It calls the fetch_data function to download the data from the URL.

The downloaded data is then processed using the read_data function to extract the relevant attributes.

The list of dictionaries is converted to a DataFrame using pd.DataFrame().

Finally, the convert_to_csv function is called to convert the DataFrame into a CSV file with the filename "meteorite.csv".

# **Question 7 solution based on the Question 4 data**

In [4]:
import pandas as pd

meteorites_df = pd.read_csv('meteorite.csv')


In [5]:
meteorites_df.head()

| | Name of Earth Meteorite | id | nametype | recclass | mass | year | reclat | reclong | coordinates |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Aachen | 1 | Valid | L5 | 21.0 | 1880-01-01T00:00:00.000 | 50.775 | 6.08333 | [6.08333, 50.775] |
| 1 | Aarhus | 2 | Valid | H6 | 720.0 | 1951-01-01T00:00:00.000 | 56.18333 | 10.23333 | [10.23333, 56.18333] |
| 2 | Abee | 6 | Valid | EH4 | 107000.0 | 1952-01-01T00:00:00.000 | 54.21667 | -113.0 | [-113, 54.21667] |
| 3 | Acapulco | 10 | Valid | Acapulcoite | 1914.0 | 1976-01-01T00:00:00.000 | 16.88333 | -99.9 | [-99.9, 16.88333] |
| 4 | Achiras | 370 | Valid | L6 | 780.0 | 1902-01-01T00:00:00.000 | -33.16667 | -64.95 | [-64.95, -33.16667] |


In [27]:
import pandas as pd
import plotly.express as px
import ast
import numpy as np

# Load the meteorite data from the CSV file
meteorites_df = pd.read_csv('meteorite.csv')

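# Get all the Earth meteorites that fell before the year 2000
# The "year" column holds timestamp strings such as "1880-01-01T00:00:00.000";
# keep only the leading year as an integer, and use NaN where the year is missing.
meteorites_df["only_year"] = meteorites_df["year"].apply(lambda x: int(str(x).split("T")[0].split("-")[0]) if pd.notnull(x) else np.nan)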
before_2000_meteorites = meteorites_df.loc[meteorites_df["only_year"] < 2000, "Name of Earth Meteorite"]
print("Earth meteorites that fell before the year 2000:")
print(before_2000_meteorites)

# Get the coordinates of Earth meteorites that fell before the year 1970
coordinates = meteorites_df.loc[meteorites_df["only_year"] < 1970, "coordinates"].dropna().apply(ast.literal_eval)
print("Coordinates of Earth meteorites that fell before the year 1970:")
print(coordinates)

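# Get all the Earth meteorites with a mass greater than 10,000 kg
# Note: the question says to assume the mass is already in kg, whereas this solution
# treats the "mass" column as grams and converts it to kilograms before filtering.
meteorites_df["mass_kg"] = meteorites_df["mass"] / 1000  # Convert mass from grams to kilograms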
mass_greater_than_10000kg = meteorites_df.loc[meteorites_df["mass_kg"] > 10000, "Name of Earth Meteorite"]
print("Earth meteorites with a mass greater than 10,000 kg:")
print(mass_greater_than_10000kg)

# Plot the number of Earth meteorites by year
year_counts = meteorites_df.groupby("only_year").size().reset_index(name="counts")
fig1 = px.bar(year_counts, x="only_year", y="counts", title="Number of Earth Meteorites by Year")
fig1.show()

# Plot the locations of Earth meteorites on a world map
fig2 = px.scatter_geo(meteorites_df, lat="reclat", lon="reclong", hover_name="Name of Earth Meteorite",
                      title="Locations of Earth Meteorites")
fig2.show()

# Plot a histogram of the masses of Earth meteorites
fig3 = px.histogram(meteorites_df, x="mass_kg", nbins=30, title="Mass Distribution of Earth Meteorites")
fig3.show()


Earth meteorites that fell before the year 2000:
0         Aachen
1         Aarhus
2           Abee
3       Acapulco
4        Achiras
         ...    
994     Timochin
995     Tirupati
997        Tjabe
998     Tjerebon
999    Tomakovka
Name: Name of Earth Meteorite, Length: 929, dtype: object
Coordinates of Earth meteorites that fell before the year 1970:
0          [6.08333, 50.775]
1       [10.23333, 56.18333]
2           [-113, 54.21667]
4        [-64.95, -33.16667]
5               [71.8, 32.1]
               ...          
994             [35.2, 54.5]
995     [79.41667, 13.63333]
997    [111.53333, -7.08333]
998    [106.58333, -6.66667]
999        [34.76667, 47.85]
Name: coordinates, Length: 780, dtype: object
Earth meteorites with a mass greater than 10,000 kg:
920    Sikhote-Alin
Name: Name of Earth Meteorite, dtype: object
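
The three filters can also be written without a row-by-row lambda. The following is a minimal sketch on hand-made rows, not on meteorite.csv: the names `A` to `D`, the masses, and the column name `name` are assumptions for illustration. The year is taken from the first four characters of the timestamp string rather than through `pd.to_datetime`, which can fail for years outside the range that nanosecond timestamps support. The mass filter here follows the literal wording of the question and treats the values as kilograms.

```python
import pandas as pd

# Hand-made rows for illustration only; they are not taken from meteorite.csv.
sample = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "year": [
        "1880-01-01T00:00:00.000",
        "1976-01-01T00:00:00.000",
        "2003-01-01T00:00:00.000",
        None,  # missing year
    ],
    "mass": [21.0, 12000.0, 500.0, 15000.0],
})

# Take the first four characters of each timestamp string as the year.
# Missing values stay missing and end up as NaN after to_numeric.
sample["only_year"] = pd.to_numeric(sample["year"].str.slice(0, 4), errors="coerce")

# Comparisons with NaN are False, so rows without a year are excluded.
before_2000 = sample[sample["only_year"] < 2000]
before_1970 = sample[sample["only_year"] < 1970]

# Literal reading of the question: the "mass" column is assumed to be in kg.
heavy = sample[sample["mass"] > 10000]

print(before_2000["name"].tolist())  # ['A', 'B']
print(before_1970["name"].tolist())  # ['A']
print(heavy["name"].tolist())        # ['B', 'D']
```

Row `D` has no year, so it appears in neither year filter but is still picked up by the mass filter.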
