We will collect real-time current and forecast weather data for cities using the **OpenWeather API**. It provides current weather data for any location, including over 200,000 cities, and 5-day forecasts for free (with limited API usage).



In [1]:
# Check whether the `httr` library is installed and load it
require("httr")

library(httr)
library(zoo)

Loading required package: httr
"package 'zoo' was built under R version 3.6.3"
Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric



The API base URL for current weather is [https://api.openweathermap.org/data/2.5/weather](https://api.openweathermap.org/data/2.5/weather)


In [2]:
# URL for Current Weather API
current_weather_url <- 'https://api.openweathermap.org/data/2.5/weather'

Next, let's create a list to hold the URL parameters for the current weather API.


In [3]:
# Replace this value with your own API key
your_api_key <- "b7d554a655748f172509b57d89094d44"
# Input `q` is the city name
# Input `appid` is your API key
# Input `units` is the preferred unit system, such as metric or imperial
current_query <- list(q = "Seoul", appid = your_api_key, units="metric")
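Hard-coding an API key in a notebook makes it easy to leak by accident. One common alternative, sketched below, reads the key from an environment variable. The variable name `OPENWEATHER_API_KEY` and the helper `get_api_key()` are assumptions made for this example; they are not part of the OpenWeather API or of `httr`.

```r
# Minimal sketch: read the API key from an environment variable.
# OPENWEATHER_API_KEY is a hypothetical name chosen for this example.
get_api_key <- function(var_name = "OPENWEATHER_API_KEY") {
    key <- Sys.getenv(var_name, unset = "")
    if (!nzchar(key)) {
        stop("Environment variable ", var_name, " is not set")
    }
    key
}

# Example usage (after setting the variable, e.g. in ~/.Renviron):
# your_api_key <- get_api_key()
# current_query <- list(q = "Seoul", appid = your_api_key, units = "metric")
```

With this approach the key stays out of the notebook itself, so the notebook can be shared without editing it first.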

Now we can make an HTTP request to the current weather API.


In [4]:
response <- GET(current_weather_url, query=current_query)
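To see how the `query` list ends up in the request, we can build the full URL without sending anything. The sketch below uses `httr`'s `modify_url()` with a placeholder key; the exact encoding that `GET()` sends may differ slightly, so treat the printed URL as an illustration only.

```r
library(httr)

base_url <- "https://api.openweathermap.org/data/2.5/weather"
# "YOUR_API_KEY" is a placeholder, not a working key
query <- list(q = "Seoul", appid = "YOUR_API_KEY", units = "metric")

# Build the full request URL without making a network call
full_url <- modify_url(base_url, query = query)
print(full_url)
```

Each element of the list becomes one `name=value` pair after the `?` in the URL.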

If we check the response type, we can see it is in JSON format.


In [5]:
http_type(response)
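Before parsing the body, it is usually worth confirming that the request succeeded, since an invalid key or an unknown city typically returns an error status rather than weather data. Below is a minimal sketch using `httr`'s `http_error()` and `status_code()`; the wrapper name `get_json_or_stop()` is hypothetical.

```r
library(httr)

# Hypothetical helper: perform a GET request and return the parsed body,
# stopping with an informative message on any HTTP error (4xx or 5xx).
get_json_or_stop <- function(url, query) {
    response <- GET(url, query = query)
    if (http_error(response)) {
        stop("Request failed with HTTP status ", status_code(response))
    }
    content(response, as = "parsed")
}

# Example usage (requires network access and a valid key):
# json_result <- get_json_or_stop(current_weather_url, current_query)
```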

JSON is an open standard file and data interchange format that uses human-readable text to store and transmit data objects. To read the JSON HTTP response, you can use the `content()` function to parse it as a named list in R.


In [6]:
json_result <- content(response, as="parsed")
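To illustrate what "parsed as a named list" means, the sketch below parses a small JSON string with `jsonlite::fromJSON()`, which, to our understanding, is what `httr` relies on for JSON bodies (the `jsonlite` package must be installed). The sample values are made up and only mimic the shape of the real response.

```r
library(jsonlite)

# Made-up JSON that mimics part of the current weather response
sample_json <- '{
  "weather": [{"main": "Clouds", "description": "broken clouds"}],
  "main": {"temp": 4.39, "humidity": 70},
  "dt": 1638440772
}'

# simplifyVector = FALSE keeps JSON arrays as R lists instead of vectors
parsed <- fromJSON(sample_json, simplifyVector = FALSE)

class(parsed)               # "list"
parsed$weather[[1]]$main    # "Clouds"
parsed$main$temp            # 4.39
```

JSON objects become named lists accessed with `$`, while JSON arrays become unnamed lists accessed with `[[ ]]`.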

If you use the `class()` function, you can see it is an R `list` object.


In [7]:
class(json_result)

Now let's print the JSON result.


In [8]:
json_result

It contains very detailed weather data about the city of `Seoul`. Feel free to try other cities as well. We need to convert the named list to a data frame so that we can use data frame operations to process the data. Below is a simple example; you may implement the conversion in your own way.


In [9]:
# Create some empty vectors to hold data temporarily
weather <- c()
visibility <- c()
temp <- c()
temp_min <- c()
temp_max <- c()
pressure <- c()
humidity <- c()
wind_speed <- c()
wind_deg <- c()
datetime <- c()

Now assign the values in the `json_result` list to the corresponding vectors.


In [10]:
# $weather is also a list with one element, its $main element indicates the weather status such as clear or rain
weather <- c(weather, json_result$weather[[1]]$main)
# Get Visibility
visibility <- c(visibility, json_result$visibility)
# Get current temperature 
temp <- c(temp, json_result$main$temp)
# Get min temperature 
temp_min <- c(temp_min, json_result$main$temp_min)
# Get max temperature 
temp_max <- c(temp_max, json_result$main$temp_max)
# Get pressure
pressure <- c(pressure, json_result$main$pressure)
# Get humidity
humidity <- c(humidity, json_result$main$humidity)
# Get wind speed
wind_speed <- c(wind_speed, json_result$wind$speed)
# Get wind direction
wind_deg <- c(wind_deg, json_result$wind$deg)
# Get the observation timestamp (Unix time, UTC)
datetime <- c(datetime, json_result$dt)

Combine all vectors as columns of a data frame.


In [11]:
# Combine all vectors
weather_data_frame <- data.frame(weather=weather, 
                                 visibility=visibility, 
                                 temp=temp, 
                                 temp_min=temp_min, 
                                 temp_max=temp_max, 
                                 pressure=pressure, 
                                 humidity=humidity, 
                                 wind_speed=wind_speed, 
                                 wind_deg=wind_deg,
                                 datetime=datetime)
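One caveat of the vector-by-vector approach is that `c(x, NULL)` adds nothing, so a field that is missing from the response leaves its vector shorter than the others. As one possible alternative, the sketch below builds the row directly and substitutes `NA` for missing fields. The helper names `or_na()` and `current_weather_to_df()` are hypothetical, and the input list is made up for illustration.

```r
# Return x, or NA when the field is absent (NULL) in the parsed list
or_na <- function(x) if (is.null(x)) NA else x

# Hypothetical helper: flatten one parsed current-weather result into one row.
# It assumes the `weather` element is always present.
current_weather_to_df <- function(result) {
    data.frame(
        weather    = or_na(result$weather[[1]]$main),
        visibility = or_na(result$visibility),
        temp       = or_na(result$main$temp),
        temp_min   = or_na(result$main$temp_min),
        temp_max   = or_na(result$main$temp_max),
        pressure   = or_na(result$main$pressure),
        humidity   = or_na(result$main$humidity),
        wind_speed = or_na(result$wind$speed),
        wind_deg   = or_na(result$wind$deg),
        datetime   = or_na(result$dt),
        stringsAsFactors = FALSE
    )
}

# Made-up input with `visibility` and `wind$deg` deliberately missing
sample_result <- list(
    weather = list(list(main = "Clouds")),
    main = list(temp = 4.39, temp_min = 0.42, temp_max = 7.69,
                pressure = 1015, humidity = 70),
    wind = list(speed = 4.63),
    dt = 1638440772
)
row <- current_weather_to_df(sample_result)
print(row)
```

The result always has one row and ten columns, with `NA` in the columns whose fields were absent.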

In [12]:
# Check the generated data frame
print(weather_data_frame)

  weather visibility temp temp_min temp_max pressure humidity wind_speed
1  Clouds      10000 4.39     0.42     7.69     1015       70       4.63
  wind_deg   datetime
1      180 1638440772
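The `datetime` column holds a Unix timestamp (seconds since 1970-01-01 UTC), which is hard to read at a glance. A minimal sketch of converting it to a date-time with base R, using the value from the output above:

```r
# Unix timestamp taken from the sample output above
dt <- 1638440772

# Convert seconds since 1970-01-01 UTC into a date-time object
dt_utc <- as.POSIXct(dt, origin = "1970-01-01", tz = "UTC")
format(dt_utc, "%Y-%m-%d %H:%M:%S")   # "2021-12-02 10:26:12"
```

The same call works on a whole column, for example `as.POSIXct(weather_data_frame$datetime, origin = "1970-01-01", tz = "UTC")`.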


# Get 5-day weather forecasts for a list of cities using the OpenWeather API


The following cells define a function that returns a data frame containing 5-day weather forecasts for a list of cities.


In [13]:
# Create some empty vectors to hold data temporarily

# City name column
city <- c()
# Weather column, rainy or cloudy, etc
weather <- c()
# Sky visibility column
visibility <- c()
# Current temperature column
temp <- c()
# Min temperature column
temp_min <- c()
# Max temperature column
temp_max <- c()
# Pressure column
pressure <- c()
# Humidity column
humidity <- c()
# Wind speed column
wind_speed <- c()
# Wind direction column
wind_deg <- c()
# Forecast timestamp
forecast_datetime <- c()
# Season column
# Note that for season, you can hard code a season value from levels Spring, Summer, Autumn, and Winter based on your current month.
season <- c()
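As a small illustration of deriving the season from a month number, here is a sketch of a lookup helper. The function name `month_to_season()` is hypothetical, and the mapping assumes Northern Hemisphere meteorological seasons (December to February is winter, and so on).

```r
# Map month numbers (1-12) to Northern Hemisphere meteorological seasons
month_to_season <- function(months) {
    lookup <- c("winter", "winter",            # Jan, Feb
                "spring", "spring", "spring",  # Mar, Apr, May
                "summer", "summer", "summer",  # Jun, Jul, Aug
                "fall", "fall", "fall",        # Sep, Oct, Nov
                "winter")                      # Dec
    lookup[months]
}

month_to_season(c(12, 1, 4, 7, 10))
# "winter" "winter" "spring" "summer" "fall"

# Season for the current month
current_month <- as.numeric(format(Sys.Date(), "%m"))
month_to_season(current_month)
```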


In [14]:
# Get forecast data for a given city list
get_weather_forecaset_by_cities <- function(city_names){
    # Forecast API URL
    forecast_url <- 'https://api.openweathermap.org/data/2.5/forecast'
    for (city_name in city_names){
        # Create query parameters, reusing the API key defined earlier
        forecast_query <- list(q = city_name, appid = your_api_key, units="metric")
        # Make HTTP GET call for the given city
        response <- GET(forecast_url, query=forecast_query)
        # Note that the 5-day forecast JSON result is a list of lists. You can print the response to check the results
        json_list <- content(response, as="parsed")
        results <- json_list$list
        # Loop over the forecast entries and append each value to its vector
        for(result in results) {
            city <- c(city, city_name)
            weather <- c(weather, result$weather[[1]]$main)
            visibility <- c(visibility, result$visibility)
            temp <- c(temp, result$main$temp)
            temp_min <- c(temp_min, result$main$temp_min)
            temp_max <- c(temp_max, result$main$temp_max)
            pressure <- c(pressure, result$main$pressure)
            humidity <- c(humidity, result$main$humidity)
            wind_speed <- c(wind_speed, result$wind$speed)
            wind_deg <- c(wind_deg, result$wind$deg)
            forecast_datetime <- c(forecast_datetime, result$dt_txt)
        }
    }

    # Return an empty data frame if no forecasts were collected
    if (length(forecast_datetime) == 0) {
        return(data.frame())
    }

    # Derive the season from the month of each forecast timestamp
    months <- as.numeric(format(as.Date(forecast_datetime), '%m'))
    index <- setNames(rep(c('winter','spring','summer','fall'),each = 3), c(12,1:11))
    season <- unname(index[as.character(months)])

    # Combine the vectors into a data frame
    df <- data.frame(city=city,
                     weather=weather,
                     visibility=visibility,
                     temp=temp,
                     temp_min=temp_min,
                     temp_max=temp_max,
                     pressure=pressure,
                     humidity=humidity,
                     wind_speed=wind_speed,
                     wind_deg=wind_deg,
                     forecast_datetime=forecast_datetime,
                     season=season)

    # Return a data frame
    return(df)
}
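Growing vectors one element at a time works for a handful of cities, but another way to structure the conversion is to build one small data frame per forecast entry and bind them together. The sketch below shows this on made-up input, with only a subset of the columns for brevity; the helper names `or_na()` and `forecast_results_to_df()` are hypothetical.

```r
# Return x, or NA when the field is absent (NULL) in the parsed list
or_na <- function(x) if (is.null(x)) NA else x

# Hypothetical helper: convert the `list` element of one parsed forecast
# response into a data frame with one row per forecast timestamp
forecast_results_to_df <- function(city_name, results) {
    rows <- lapply(results, function(result) {
        data.frame(
            city = city_name,
            weather = or_na(result$weather[[1]]$main),
            temp = or_na(result$main$temp),
            wind_deg = or_na(result$wind$deg),
            forecast_datetime = or_na(result$dt_txt),
            stringsAsFactors = FALSE
        )
    })
    # Stack the one-row data frames into a single data frame
    do.call(rbind, rows)
}

# Made-up forecast entries; the second one has no wind direction
sample_results <- list(
    list(weather = list(list(main = "Rain")), main = list(temp = 5.14),
         wind = list(deg = 260), dt_txt = "2021-12-02 12:00:00"),
    list(weather = list(list(main = "Clear")), main = list(temp = 2.48),
         wind = list(), dt_txt = "2021-12-03 03:00:00")
)
sample_df <- forecast_results_to_df("Seoul", sample_results)
print(sample_df)
```

Because missing fields become `NA` instead of being dropped, every row has the same columns and the rows can always be bound together.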


Call the `get_weather_forecaset_by_cities` function with a list of cities, then write the resulting data frame to a CSV file called `cities_weather_forecast.csv`.


In [15]:
cities <- c("Seoul", "Washington, D.C.", "Paris", "Suzhou")
cities_weather_df <- get_weather_forecaset_by_cities(cities)

In [16]:
head(cities_weather_df)

city,weather,visibility,temp,temp_min,temp_max,pressure,humidity,wind_speed,wind_deg,forecast_datetime,season
Seoul,Rain,10000,5.14,5.14,7.01,1015,64,5.39,260,2021-12-02 12:00:00,winter
Seoul,Rain,10000,5.85,5.85,6.68,1014,57,5.13,255,2021-12-02 15:00:00,winter
Seoul,Rain,10000,4.6,4.6,4.6,1014,80,3.57,306,2021-12-02 18:00:00,winter
Seoul,Rain,10000,3.46,3.46,3.46,1016,47,4.49,317,2021-12-02 21:00:00,winter
Seoul,Clouds,10000,1.89,1.89,1.89,1019,41,5.28,314,2021-12-03 00:00:00,winter
Seoul,Clear,10000,2.48,2.48,2.48,1020,35,5.46,305,2021-12-03 03:00:00,winter


In [17]:
# Write cities_weather_df to `cities_weather_forecast.csv`
write.csv(cities_weather_df, "cities_weather_forecast.csv", row.names=FALSE)
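A quick way to check that a CSV file was written as expected is to read it back and compare it with the original. The sketch below does this with a small made-up data frame and a temporary file, so it does not touch `cities_weather_forecast.csv`.

```r
# Small made-up data frame standing in for cities_weather_df
sample_df <- data.frame(city = c("Seoul", "Paris"),
                        temp = c(5.14, 8.2),
                        stringsAsFactors = FALSE)

# Write to a temporary file so the example leaves nothing behind
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_df, csv_path, row.names = FALSE)

# Read it back and inspect the structure
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)
str(round_trip)

# Clean up the temporary file
unlink(csv_path)
```

Setting `row.names = FALSE` when writing avoids an extra unnamed index column appearing when the file is read back.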

## Download datasets as CSV files from cloud storage


In [18]:
# Download several datasets

# Download some general city information such as name and locations
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_worldcities.csv"
# download the file
download.file(url, destfile = "raw_worldcities.csv")

# Download a specific hourly Seoul bike sharing demand dataset
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_seoul_bike_sharing.csv"
# download the file
download.file(url, destfile = "raw_seoul_bike_sharing.csv")
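When a notebook is re-run often, it can be convenient to skip downloads for files that are already on disk. A minimal sketch of such a guard follows; the helper name `download_if_missing()` is hypothetical, and `mode = "wb"` is an assumption made to avoid line-ending conversion on Windows.

```r
# Hypothetical helper: download a file only if it is not already on disk.
# Returns TRUE when a download was attempted, FALSE when it was skipped.
download_if_missing <- function(url, destfile) {
    if (file.exists(destfile)) {
        message("Skipping download, file already exists: ", destfile)
        return(invisible(FALSE))
    }
    download.file(url, destfile = destfile, mode = "wb")
    invisible(TRUE)
}

# Example usage (requires network access):
# download_if_missing(url, "raw_seoul_bike_sharing.csv")
```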