<h1>Collect Weather Data Using OpenWeather API</h1>

This **R** notebook collects current and forecast weather data for a set of cities through the **OpenWeather API**, a REST API that provides current conditions and 5-day forecasts.

## Requirements
- Free [OpenWeather](https://home.openweathermap.org/) API account
- `httr` and `tidyverse`

After signing in, go to Account -> My API Keys to find your API key. The key is needed to authenticate HTTP requests to the OpenWeather API.


<a href="https://cognitiveclass.ai/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork878-2022-01-01">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/module_1/images/l2-openweather-apikey.png" width="400" align="center">
</a>

<a href="https://cognitiveclass.ai/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork878-2022-01-01">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/module_1/images/l2-openweather-apikey-value.png" width="400" align="center">
</a>
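
Because the key authenticates every request, it is usually safer to keep it out of the notebook itself. A minimal sketch, assuming the key has been stored in an environment variable named `OPENWEATHER_API_KEY` (a hypothetical name chosen for this example, not something OpenWeather requires); `get_api_key` is likewise a hypothetical helper:

```r
# Read the API key from an environment variable instead of hard-coding it.
# OPENWEATHER_API_KEY is an assumed name; set it in ~/.Renviron or with Sys.setenv().
get_api_key <- function(var = "OPENWEATHER_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  key
}
```

With the variable set, `api_key <- get_api_key()` could replace a hard-coded string in the cells that follow.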

## API Test
Replace `{api_key}` with your actual API key and open the following URL in a browser:

https://api.openweathermap.org/data/2.5/weather?q=Seoul&appid={api_key}



This should return JSON weather data (instead of 401 or other error status), similar to the following:

```JSON
{"coord":{"lon":126.9778,"lat":37.5683},
"weather":[{"id":800,"main":"Clear","description":"clear sky","icon":"01n"}],
"base":"stations",
"main":{"temp":285.16,"feels_like":284.04,"temp_min":284.15,"temp_max":287.15,"pressure":1020,"humidity":62},
"visibility":10000,
"wind":{"speed":1.03,"deg":220},"clouds":{"all":0},"dt":1617718307,"sys":{"type":1,"id":8105,"country":"KR","sunrise":1617657021,"sunset":1617703103},"timezone":32400,"id":1835848,"name":"Seoul","cod":200}
```
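
Note that the test URL passes no `units` parameter, so temperatures such as `"temp":285.16` are in Kelvin, the API default. As a rough illustration, the sketch below parses a trimmed copy of the sample response with `jsonlite` (installed as a dependency of `httr`) and converts the temperature to Celsius. `kelvin_to_celsius` is a helper name introduced here for the example.

```r
library(jsonlite)

# Trimmed copy of the sample response shown above
sample_json <- '{"main":{"temp":285.16,"humidity":62},"name":"Seoul","cod":200}'
parsed <- fromJSON(sample_json)

# Convert a temperature from Kelvin to degrees Celsius
kelvin_to_celsius <- function(kelvin) kelvin - 273.15

city_name <- parsed$name
temp_celsius <- round(kelvin_to_celsius(parsed$main$temp), 2)
cat(city_name, temp_celsius, "\n")
```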

## Get Current Weather Data for Seoul using R
- Download current weather using the `httr` library.
- The API base URL to get current weather is https://api.openweathermap.org/data/2.5/weather

In [1]:
# The httr package might need to be installed first, e.g. install.packages("httr")
require("httr")

library(httr)
library(tidyverse)

Loading required package: httr
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ ggplot2 3.3.0     ✔ purrr   0.3.4
✔ tibble  3.0.1     ✔ dplyr   0.8.5
✔ tidyr   1.0.2     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()


In [2]:
# URL for Current Weather API
current_weather_url <- 'https://api.openweathermap.org/data/2.5/weather'

Create a list to hold the URL query parameters for the current weather API.


In [1]:
# Replace with your actual API key
api_key <- "api_key"
# q: city name
# appid: API key
# units: preferred units (metric or imperial)
current_query <- list(q = "Seoul", appid = api_key, units="metric")
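
`httr` URL-encodes this list and appends it to the base URL as a query string. To see roughly what will be requested without contacting the API, the sketch below builds the URL with `httr::modify_url()`. The key shown is a placeholder.

```r
library(httr)

current_weather_url <- "https://api.openweathermap.org/data/2.5/weather"
# Placeholder key for illustration only
current_query <- list(q = "Seoul", appid = "api_key", units = "metric")

# Build (but do not send) the request URL
request_url <- modify_url(current_weather_url, query = current_query)
print(request_url)
```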

Make an HTTP request to the current weather API. `GET()` comes from `httr`, so the package must be loaded in the current session.

In [1]:
response <- GET(current_weather_url, query=current_query)
response



The response type should be JSON.

In [5]:
http_type(response)
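
Before parsing, it can help to fail early on an error status (for example 401 for an invalid key) or an unexpected content type. A minimal sketch of such a check follows; `fetch_current_weather` is a hypothetical wrapper, not part of `httr` or of the original notebook.

```r
library(httr)

# Hypothetical helper: request current weather and validate the response
fetch_current_weather <- function(city, api_key, units = "metric") {
  response <- GET(
    "https://api.openweathermap.org/data/2.5/weather",
    query = list(q = city, appid = api_key, units = units)
  )
  # Turn HTTP error statuses (4xx/5xx) into R errors
  stop_for_status(response)
  # Refuse to parse anything that is not JSON
  if (http_type(response) != "application/json") {
    stop("Expected a JSON response, got: ", http_type(response), call. = FALSE)
  }
  content(response, as = "parsed")
}
```

Calling `fetch_current_weather("Seoul", api_key)` with a valid key would be expected to return the same kind of named list that is parsed in the next step.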

Parse the JSON response as a named list.

In [6]:
json_result <- content(response, as="parsed")

In [7]:
class(json_result)

Display the JSON result.


In [8]:
json_result

The parsed result is a nested list, which is awkward to analyse directly.
Extract the fields of interest and convert them to a dataframe.

In [9]:
# Create some empty vectors to hold data temporarily
weather <- c()
visibility <- c()
temp <- c()
temp_min <- c()
temp_max <- c()
pressure <- c()
humidity <- c()
wind_speed <- c()
wind_deg <- c()

Now append the values from the `json_result` list to the individual vectors.

In [10]:
# $weather is also a list with one element, its $main element indicates the weather status such as clear or rain
weather <- c(weather, json_result$weather[[1]]$main)
# Get Visibility
visibility <- c(visibility, json_result$visibility)
# Get current temperature 
temp <- c(temp, json_result$main$temp)
# Get min temperature 
temp_min <- c(temp_min, json_result$main$temp_min)
# Get max temperature 
temp_max <- c(temp_max, json_result$main$temp_max)
# Get pressure
pressure <- c(pressure, json_result$main$pressure)
# Get humidity
humidity <- c(humidity, json_result$main$humidity)
# Get wind speed
wind_speed <- c(wind_speed, json_result$wind$speed)
# Get wind direction
wind_deg <- c(wind_deg, json_result$wind$deg)


Combine all vectors into dataframe columns.

In [11]:
# Combine all vectors
weather_data_frame <- data.frame(weather=weather, 
                                 visibility=visibility, 
                                 temp=temp, 
                                 temp_min=temp_min, 
                                 temp_max=temp_max, 
                                 pressure=pressure, 
                                 humidity=humidity, 
                                 wind_speed=wind_speed, 
                                 wind_deg=wind_deg)

In [12]:
# Check the generated data frame
print(weather_data_frame)

  weather visibility   temp temp_min temp_max pressure humidity wind_speed
1   Clear      10000 -17.34   -17.34   -15.31     1032       33       3.41
  wind_deg
1      305
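
If a field is absent from a response, `json_result$...` returns `NULL` and `c()` silently drops it, so the vectors can end up with different lengths. One way to guard against this is sketched below on a hand-built list that mimics the parsed response (values taken from the sample JSON earlier, with `visibility` deliberately left out). `or_na` is a hypothetical helper introduced for the example.

```r
# Hand-built stand-in for the parsed response; visibility is omitted on purpose
json_result <- list(
  weather = list(list(main = "Clear")),
  main = list(temp = 285.16, temp_min = 284.15, temp_max = 287.15,
              pressure = 1020, humidity = 62),
  wind = list(speed = 1.03, deg = 220)
)

# Replace a missing (NULL) field with NA so every column keeps exactly one value
or_na <- function(x) if (is.null(x)) NA else x

weather_row <- data.frame(
  weather    = or_na(json_result$weather[[1]]$main),
  visibility = or_na(json_result$visibility),
  temp       = or_na(json_result$main$temp),
  temp_min   = or_na(json_result$main$temp_min),
  temp_max   = or_na(json_result$main$temp_max),
  pressure   = or_na(json_result$main$pressure),
  humidity   = or_na(json_result$main$humidity),
  wind_speed = or_na(json_result$wind$speed),
  wind_deg   = or_na(json_result$wind$deg)
)
print(weather_row)
```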


## Get 5-day weather forecasts for a list of cities using the OpenWeather API
- Write a function to return a dataframe containing 5-day weather forecasts for a list of cities.
- Use "Seoul", "Washington, D.C.", "Paris" and "Suzhou" as examples.

In [13]:
# Create empty vectors to hold data temporarily

# City name column
city <- c()
# Weather column, rainy or cloudy, etc
weather <- c()
# Sky visibility column
visibility <- c()
# Current temperature column
temp <- c()
# Min temperature column
temp_min <- c()
# Max temperature column
temp_max <- c()
# Pressure column
pressure <- c()
# Humidity column
humidity <- c()
# Wind speed column
wind_speed <- c()
# Wind direction column
wind_deg <- c()
# Forecast timestamp
forecast_datetime <- c()

# Season column
# For season, a season value from levels Spring, Summer, Autumn, and Winter will be used based on the current month.
current_month <- c()
season <- c()

In [14]:
get_weather_forecaset_by_cities <- function(city_names){
    
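    # Get forecast data for a given city list.
    # Note: the accumulator vectors (city, weather, temp, ...) start as local copies
    # of the empty global vectors created in the previous cell.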

    df <- data.frame()
    for (city_name in city_names){
        # Forecast API URL
        forecast_url <- 'https://api.openweathermap.org/data/2.5/forecast'
        # Create query parameters
        forecast_query <- list(q = city_name, appid = api_key, units="metric")
        # Make HTTP GET call for the given city
        json_list <- GET(forecast_url, query = forecast_query) %>% content(as = "parsed")
                
        # Note that the 5-day forecast JSON result is a list of lists. You can print the response to check the results
        results <- json_list$list
        
        # Loop over the forecast entries
        for(result in results) {
            city <- c(city, city_name)
            
            weather <- c(weather, result$weather[[1]]$main)
            visibility <- c(visibility, result$visibility)
            temp <- c(temp, result$main$temp)
            temp_min <- c(temp_min, result$main$temp_min)
            temp_max <- c(temp_max, result$main$temp_max)
            pressure <- c(pressure, result$main$pressure)
            humidity <- c(humidity, result$main$humidity)
            wind_speed <- c(wind_speed, result$wind$speed)
            wind_deg <- c(wind_deg, result$wind$deg)
            
            # Get datetime
            forecast_datetime <- c(forecast_datetime , result$dt_txt)

            # Get the season (Spring, Summer, Autumn, or Winter) from the month of the forecast timestamp
            forecast_month <- as.POSIXct(result$dt_txt, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
            forecast_month <- as.integer(format(forecast_month, format = "%m"))

            # Map the month to a Northern Hemisphere season and append it
            if (forecast_month %in% c(12, 1, 2)) {
                forecast_season <- "Winter"
            } else if (forecast_month <= 5) {
                forecast_season <- "Spring"
            } else if (forecast_month <= 8) {
                forecast_season <- "Summer"
            } else {
                forecast_season <- "Autumn"
            }
            season <- c(season, forecast_season)
            # Consider adding logic to account for season reversal by hemisphere
        }
        
        # Combine the accumulated vectors into a data frame
        df <- data.frame(city = city,
                         weather = weather,
                         visibility = visibility,
                         temp = temp,
                         temp_min = temp_min,
                         temp_max = temp_max,
                         pressure = pressure,
                         humidity = humidity,
                         wind_speed = wind_speed,
                         wind_deg = wind_deg,
                         forecast_datetime = forecast_datetime,
                         season = season)
    }
    
    # Return a data frame
    return(df)
    
}
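
The month-to-season mapping can also be pulled out into a standalone, vectorised helper, which makes it easier to test and to extend with the hemisphere logic suggested in the code comment. A possible sketch follows; `month_to_season` and its `southern` argument are hypothetical names, and the meteorological convention (December to February as Northern Hemisphere winter) is assumed.

```r
# Map month numbers (1-12) to season names.
# southern = TRUE flips the seasons for Southern Hemisphere cities.
month_to_season <- function(month, southern = FALSE) {
  stopifnot(all(month %in% 1:12))
  # Northern Hemisphere season for each month, January through December
  northern <- c("Winter", "Winter",
                "Spring", "Spring", "Spring",
                "Summer", "Summer", "Summer",
                "Autumn", "Autumn", "Autumn",
                "Winter")
  season <- northern[month]
  if (southern) {
    flipped <- c(Winter = "Summer", Spring = "Autumn",
                 Summer = "Winter", Autumn = "Spring")
    season <- unname(flipped[season])
  }
  season
}

# Months of two forecast timestamps in the dt_txt format
timestamps <- c("2023-01-24 21:00:00", "2023-07-01 12:00:00")
months <- as.integer(format(as.POSIXct(timestamps, tz = "UTC"), "%m"))
print(month_to_season(months))
```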

Call `get_weather_forecaset_by_cities`, then write the resulting dataframe to a CSV file called `cities_weather_forecast.csv`.

In [15]:
cities <- c("Seoul", "Washington, D.C.", "Paris", "Suzhou")
cities_weather_df <- get_weather_forecaset_by_cities(cities)
cities_weather_df

| city | weather | visibility | temp | temp_min | temp_max | pressure | humidity | wind_speed | wind_deg | forecast_datetime | season |
|---|---|---|---|---|---|---|---|---|---|---|---|
| `<fct>` | `<fct>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<int>` | `<int>` | `<dbl>` | `<int>` | `<fct>` | `<fct>` |
| Seoul | Clear | 10000 | -16.38 | -16.38 | -14.47 | 1033 | 32 | 2.31 | 310 | 2023-01-24 21:00:00 | Winter |
| Seoul | Clear | 10000 | -14.73 | -14.73 | -13.42 | 1034 | 28 | 1.71 | 302 | 2023-01-25 00:00:00 | Winter |
| Seoul | Clear | 10000 | -9.84 | -9.84 | -9.84 | 1033 | 17 | 1.66 | 325 | 2023-01-25 03:00:00 | Winter |
| Seoul | Clear | 10000 | -7.17 | -7.17 | -7.17 | 1030 | 17 | 2.09 | 302 | 2023-01-25 06:00:00 | Winter |
| Seoul | Clear | 10000 | -6.89 | -6.89 | -6.89 | 1029 | 26 | 0.96 | 315 | 2023-01-25 09:00:00 | Winter |
| Seoul | Clear | 10000 | -6.68 | -6.68 | -6.68 | 1028 | 30 | 1.33 | 146 | 2023-01-25 12:00:00 | Winter |
| Seoul | Clouds | 10000 | -6.47 | -6.47 | -6.47 | 1028 | 39 | 0.83 | 140 | 2023-01-25 15:00:00 | Winter |
| Seoul | Clouds | 10000 | -6.39 | -6.39 | -6.39 | 1027 | 43 | 1.43 | 118 | 2023-01-25 18:00:00 | Winter |
| Seoul | Clouds | 10000 | -6.42 | -6.42 | -6.42 | 1026 | 52 | 1.57 | 78 | 2023-01-25 21:00:00 | Winter |
| Seoul | Clouds | 10000 | -6.20 | -6.20 | -6.20 | 1026 | 73 | 1.55 | 77 | 2023-01-26 00:00:00 | Winter |


In [16]:
# Write cities_weather_df to `cities_weather_forecast.csv`
write.csv(cities_weather_df, "cities_weather_forecast.csv", row.names=FALSE)
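
To confirm that a file was written as expected, it can be read back and inspected. The sketch below uses a two-row stand-in data frame (values copied from the output above) and a temporary file rather than the real `cities_weather_forecast.csv`. The timestamps are assumed here to be in UTC.

```r
# Small stand-in for the forecast data frame
sample_df <- data.frame(
  city = c("Seoul", "Seoul"),
  temp = c(-16.38, -14.73),
  forecast_datetime = c("2023-01-24 21:00:00", "2023-01-25 00:00:00"),
  stringsAsFactors = FALSE
)

# Write to a temporary file, then read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_df, csv_path, row.names = FALSE)
restored <- read.csv(csv_path, stringsAsFactors = FALSE)

# Timestamps come back as text; convert them if date arithmetic is needed
restored$forecast_datetime <- as.POSIXct(restored$forecast_datetime,
                                         format = "%Y-%m-%d %H:%M:%S",
                                         tz = "UTC")
str(restored)
```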

For more details about HTTP requests with `httr`, please refer to the previous HTTP request notebook here: 

[HTTP request in R](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0101EN-Coursera/v2/M4_Working_With_Data/lab2_jupyter_http-request.ipynb)


## Download datasets as CSV files from cloud storage
- As per project instructions, download aggregated datasets from cloud storage.
- These will subsequently be used for data wrangling.

In [17]:
# Download several datasets

# Download general city information such as name and locations
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_worldcities.csv"
# download the file
download.file(url, destfile = "raw_worldcities.csv")

# Download a specific hourly bike sharing demand dataset for Seoul
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_seoul_bike_sharing.csv"
# download the file
download.file(url, destfile = "raw_seoul_bike_sharing.csv")
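
When the notebook is re-run, the same files are downloaded again. A small wrapper can skip files that already exist and stop on a failed download. This is only a sketch; `download_if_missing` is a hypothetical helper, not a base R function.

```r
# Download a file only if it is not already present locally.
# Returns TRUE (invisibly) if a download happened, FALSE if it was skipped.
download_if_missing <- function(url, destfile) {
  if (file.exists(destfile)) {
    message("Skipping ", destfile, " (already present)")
    return(invisible(FALSE))
  }
  # mode = "wb" avoids line-ending changes on Windows
  status <- download.file(url, destfile = destfile, mode = "wb")
  if (status != 0) {
    stop("Download failed for ", url, call. = FALSE)
  }
  invisible(TRUE)
}
```

For example, `download_if_missing(url, "raw_worldcities.csv")` could stand in for the corresponding `download.file()` call above.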