How do I plot temperature and precipitation for 3 stations in Tanzania from 1957 to 2016? #18

Closed
tosinaregbs opened this issue Nov 25, 2016 · 17 comments

@tosinaregbs

Dear Sparks,
Thanks for the GSODR package tutorial provided.
Using the Global Summary of the Day data from NOAA, I want to plot temperature (mean daily minimum and mean daily maximum temperatures) and precipitation (total and average precipitation) for three stations in Tanzania (Kigoma, Zanzibar, and Mtwara) from 1957 to 2016.
I then want to repeat the process for Tanzania as a whole.
I read the example for Toowoomba, Queensland, for 2010 provided on GitHub. I wanted to start with your example and then adapt the code to my problem, but I have not been able to do so. I also tried to replicate the Toowoomba example exactly as presented, but still could not get it to work.
See my attempts below, with the messages I get. I would appreciate any help with plotting the variables described above for the three stations and for Tanzania as a country from 1957 to 2016. I am not proficient in R.

Thanks.

[Images: kigoma, queensland]

@adamhsparks
Member

adamhsparks commented Nov 25, 2016

Hi @tosinaregbs

Here's what I suggest. Install the latest version of GSODR from GitHub and try this method.

It will give you a data frame object in R for all three stations and all years you're interested in. You can use that for plotting your data. I'm not clear on how exactly you're plotting it or I'd have given you an example of that too. Is it annual? All years by station? Daily by station for all years? Lines? Points? Bar graph?

# Install the latest version from GitHub
install.packages("devtools")
devtools::install_github("adamhsparks/GSODR")
library(GSODR)

# Fetch the latest station list from the FTP server
station_list <- get_station_list()

# Create a list of just the stations of interest
station_list <- station_list[station_list$STN_NAME == "KIGOMA" |
                               station_list$STN_NAME == "ZANZIBAR" |
                               station_list$STN_NAME == "MTWARA"]

station_list <- station_list$STNID

# Fetch the data for all station/year combinations and save in a data.frame called "TZA"
TZA <- get_GSOD(years = 1957:2016, station = station_list)

Or, if you wish to use the CRAN version, note the error message: you've not given R a path where it can write the .csv file on your local disk. The following should fix that. Unlike the latest version on GitHub, though, you'll need to import the CSV file back into R for plotting.

get_GSOD(years = 1957:2016, station = "638010-99999", path = "~/")
kigoma <- read.csv("~/GSOD-638010-99999-1957-to-2016.csv")

I suggest using the GitHub version, as it has many updates and works better, especially in cases like this.
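Station names are not guaranteed to be unique across countries, so it may be safer to filter on the country code as well as the name. The following is a minimal sketch on a made-up table, assuming the station list has STN_NAME, CTRY and STNID columns as in the station list printed later in this thread. The station IDs and the "XX" country code are hypothetical.

```r
# Toy stand-in for the table returned by get_station_list(); all values are made up
stations <- data.frame(
  STNID    = c("000001-99999", "000002-99999", "000003-99999", "000004-99999"),
  STN_NAME = c("KIGOMA", "ZANZIBAR", "MTWARA", "KIGOMA"),
  CTRY     = c("TZ", "TZ", "TZ", "XX"),
  stringsAsFactors = FALSE
)

wanted <- c("KIGOMA", "ZANZIBAR", "MTWARA")

# Keep rows whose name is in `wanted` AND whose country code is "TZ"
keep <- stations$STN_NAME %in% wanted & stations$CTRY == "TZ"
station_ids <- stations$STNID[keep]

station_ids
```

The resulting character vector has the same shape as the `station_list` of IDs passed to `get_GSOD()` above.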

@adamhsparks adamhsparks self-assigned this Nov 25, 2016
@tosinaregbs
Author

Dear Sparks,

Very grateful for the lines of code provided.

I could not install the package successfully from GitHub. I get the message below:
![fromgithub](https://cloud.githubusercontent.com/assets/23707719/20680865/d410333e-b598-11e6-8167-70b57d6bce04.

What do you advise?

@tosinaregbs
Author

[Image: githhub2]

@adamhsparks
Member

Your screenshot indicates that the "DBI" package is not installed.

Try installing it first, then rinse and repeat for any further error messages of this kind. I'm not sure why it wasn't installed along with the other dependencies. It isn't listed as a dependency of GSODR, so it must be a dependency of one of the other packages.
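Rather than working through the error messages one at a time, the missing packages can be listed up front. This is a minimal sketch using only base R. The helper name `missing_packages` is made up for illustration, and the package names passed to it are just the ones mentioned in this thread.

```r
# Return the entries of `pkgs` that are not installed in any library on this machine
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(c("DBI", "devtools"))
print(to_install)

# Install only what is missing; guarded so that non-interactive scripts install nothing
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

This only reports packages that are absent. It will not catch a package that is present but broken.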

@tosinaregbs
Author

I want to plot all years by station as line graphs for temperature, and all years by station as bar charts for precipitation. Lastly, for all years by country (Tanzania), I want temperature as a line graph and precipitation as a bar chart.

I have successfully installed GSODR from GitHub after I installed DBI as recommended.

The last few lines show the message I get when I use get_station_list().

Thanks for your time and assistance.

[Image: station list]

@adamhsparks
Member

adamhsparks commented Nov 28, 2016

You've reinstalled the CRAN version of GSODR, v0.1.9.

I suggest using the GitHub version, v1.0.0; it is the one that has the get_station_list function in it.

There was no need to reinstall GSODR at all. You only need to install the packages that weren't present, e.g. DBI, as you had already installed GSODR from GitHub.
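To confirm which version is actually loaded, one option is to check the installed version number and whether the package exports the function in question. This is a minimal sketch using base R only; the helper name `has_function` is made up for illustration.

```r
# TRUE if package `pkg` is installed and exports a function or object named `fn`
has_function <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Works for any package; "stats" is used here because it ships with R
has_function("stats", "median")

# The check relevant to this thread, run only if GSODR is installed
if (requireNamespace("GSODR", quietly = TRUE)) {
  print(packageVersion("GSODR"))
  print(has_function("GSODR", "get_station_list"))
}
```

If the second value printed is FALSE, the installed copy is most likely the older CRAN release.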

@adamhsparks
Member

If you're confused, to install the GitHub version again, just type:

# Install the latest version from GitHub
install.packages("devtools")
devtools::install_github("adamhsparks/GSODR")
library(GSODR)

@tosinaregbs
Author

tosinaregbs commented Nov 29, 2016

I used two computers, one with the GitHub version and the other with the CRAN version, and I could not use get_station_list() on either. What argument should I put in the function? I tried passing the FTP address, which I did not do correctly. I also tried some other things; see the image above. Please give an example of how to use get_station_list() based on what I am trying to do.

Thanks for your time

@adamhsparks
Member

adamhsparks commented Nov 29, 2016

Not trying to be redundant, but I did give you an example of using both the GitHub version and CRAN versions.

The older CRAN version does not contain the get_station_list function. The function takes no arguments, as shown above and again below.

# Install the latest version from GitHub
install.packages("devtools")
devtools::install_github("adamhsparks/GSODR")
library(GSODR)

# Fetch the latest station list from the FTP server
station_list <- get_station_list()

station_list
#         USAF  WBAN            STN_NAME CTRY STATE CALL    LAT      LON ELEV_M    BEGIN      END
#     1: 008268 99999           WXPOD8278   AF    NA   NA 32.950   65.567 1156.7 20100519 20120323
#     2: 010010 99999 JAN MAYEN(NOR-NAVY)   NO    NA ENJA 70.933   -8.667    9.0 19310101 20161124
#     3: 010014 99999          SORSTOKKEN   NO    NA ENSO 59.792    5.341   48.8 19861120 20161124
#     4: 010015 99999          BRINGELAND   NO    NA   NA 61.383    5.867  327.0 19870117 20111020
#     5: 010016 99999         RORVIK/RYUM   NO    NA   NA 64.850   11.233   14.0 19870116 19910806
#   ---                                                                                          
# 28318: 999999 94996       LINCOLN 11 SW   US    NE   NA 40.695  -96.854  418.2 20020114 20161125
# 28319: 999999 96404           TOK 70 SE   US    AK   NA 62.737 -141.208  609.6 20110924 20161122
# 28320: 999999 96406         RUBY 44 ESE   US    AK   NA 64.502 -154.130   78.9 20140828 20161125
# 28321: 999999 96407        SELAWIK 28 E   US    AK   NA 66.562 -159.004    6.7 20150813 20161125
# 28322: 999999 96408         DENALI 27 N   US    AK   NA 63.452 -150.875  678.2 20150819 20161125
#              STNID ELEV_M_SRTM_90m
#     1: 008268-99999            1160
#     2: 010010-99999              NA
#     3: 010014-99999              48
#     4: 010015-99999              NA
#     5: 010016-99999              NA
#   ---                             
# 28318: 999999-94996             416
# 28319: 999999-96404              NA
# 28320: 999999-96406              NA
# 28321: 999999-96407              NA
# 28322: 999999-96408              NA

# Create a list of just the stations of interest
station_list <- station_list[station_list$STN_NAME == "KIGOMA" |
                               station_list$STN_NAME == "ZANZIBAR" |
                               station_list$STN_NAME == "MTWARA"]

station_list <- station_list$STNID

# Fetch the data for all station/year combinations and save in a data.frame called "TZA"
TZA <- get_GSOD(years = 1957:2016, station = station_list)

I tested this before posting it; it works.

Without any error messages I'm afraid I can't do much more to help here. Sorry.

@tosinaregbs
Author

I am sorry for all the inconvenience; I don't intend to be a nuisance. I am not skilled in the use of R.

I have reinstalled R, followed your previous instructions to use the GitHub version, and succeeded in getting the data for the three locations. I have also successfully used get_GSOD() to download all data for Tanzania.

The last thing I need to do is to plot a line chart of temperature against year (annual) for each station and then for the country. To plot data for the locations, I tried using plot(TZA$Year, TZA$Temp) but the chart did not look too meaningful.

Kindly pardon my shortcomings.

@adamhsparks
Member

adamhsparks commented Nov 30, 2016

OK. We're getting there. :)

Try this to plot the stations' mean annual temperature.

install.packages(c("ggplot2", "dplyr"), dependencies = TRUE)

library(GSODR)
library(ggplot2)
library(dplyr)

# Fetch the latest station list from the FTP server
station_list <- get_station_list()

# Create a list of just the stations of interest
station_list <- station_list[station_list$STN_NAME == "KIGOMA" |
                               station_list$STN_NAME == "ZANZIBAR" |
                               station_list$STN_NAME == "MTWARA"]

station_list <- station_list$STNID

# Fetch the data for all station/year combinations and save in a data.frame called "TZA"
TZA <- get_GSOD(years = 1957:2016, station = station_list)

# Summarise the data by mean annual temperature for each station.
#  These functions are from the dplyr package.

TZA <-
  TZA %>%
  group_by(STN_NAME, YEAR) %>%
  summarise(ANNUAL_TEMP = mean(TEMP, na.rm = TRUE)) # na.rm skips missing daily values

# Plot the data using ggplot2
# geom_line makes the lines
# geom_point makes the points on top of the lines
# group = 1 in the ggplot command tells the plot to join the points in each panel as one line
# the "scale_colour_discrete" option changes the legend name to the right of the plot
# removing the "facet_grid()" option will overlay all stations in a single plot;
#   in that case use group = STN_NAME so that each station gets its own line

ggplot(data = TZA, aes(x = YEAR, y = ANNUAL_TEMP, group = 1)) +
  geom_line(aes(colour = STN_NAME)) +
  geom_point(aes(colour = STN_NAME)) +
  scale_colour_discrete(name  = "Station Name") +
  ylab("Mean Annual Temperature (˚C)") +
  facet_grid(STN_NAME ~ .)

[Image: rplot02]

Note that there is a stretch of missing years in the 1960s (no points plotted)...
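To see exactly which years are missing for each station, rather than reading them off the plot, the years present can be compared against the full range. This is a minimal sketch in base R on made-up data; the helper name `missing_years` is hypothetical, and the column names STN_NAME and YEAR follow the summarised table above.

```r
# Synthetic stand-in for the summarised data; the years are made up
annual <- data.frame(
  STN_NAME = c("KIGOMA", "KIGOMA", "KIGOMA", "MTWARA", "MTWARA"),
  YEAR     = c(1957, 1958, 1961, 1957, 1959)
)

# For each station, list the years in `years` that have no row in `df`
missing_years <- function(df, years) {
  lapply(split(df$YEAR, df$STN_NAME), function(present) setdiff(years, present))
}

gaps <- missing_years(annual, 1957:1961)
gaps
```

With the real data, `years` would be 1957:2016.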

@tosinaregbs
Author

Thanks so much. I have been able to plot the temperature, but I think I have missed something: the years on the axis are not clear.

[Image: temp imagetza2]

I have also tried to edit the code provided to plot precipitation. See my attempts below.

[Image: precpitation attempts]

I tried downloading country data for Tanzania, but it ends with the message below. What do you suggest? I also need to plot temperature and precipitation for the whole country.

[Images: all tanzania2; ending of the tanzania 1957to2016 data]

@adamhsparks
Member

adamhsparks commented Dec 2, 2016

I've updated the station list script with a correction for the x-axis and a plot for precipitation. The overlapping years are my fault; I left out a small step.

However, I'm sorry, I'm not sure what was wrong with your precipitation plot efforts. They look correct to me and seem to work for me; see below.

library(GSODR)
library(ggplot2)
library(dplyr)

# Fetch the latest station list from the FTP server
station_list <- get_station_list()

# Create a list of just the stations of interest
station_list <- station_list[station_list$STN_NAME == "KIGOMA" |
                               station_list$STN_NAME == "ZANZIBAR" |
                               station_list$STN_NAME == "MTWARA"]

station_list <- station_list$STNID

# Fetch the data for all station/year combinations and save in a data.frame called "TZA"
TZA <- get_GSOD(years = 1957:2016, station = station_list)

# Summarise the data by mean annual temperature for each station.
#  These functions are from the dplyr package.

TZA$YEAR <- as.numeric(as.character(TZA$YEAR)) # a numeric year fixes the x-axis overlap issue
TZA_TMP <-
  TZA %>%
  group_by(STN_NAME, YEAR) %>%
  summarise(ANNUAL_TEMP = mean(TEMP, na.rm = TRUE)) # na.rm skips missing daily values

# Plot the data using ggplot2
# geom_line makes the lines
# geom_point makes the points on top of the lines
# group = 1 in the ggplot command tells the plot to join the points in each panel as one line
# the "scale_colour_discrete" option changes the legend name to the right of the plot
# removing the "facet_grid()" option will overlay all stations in a single plot;
#   in that case use group = STN_NAME so that each station gets its own line

ggplot(data = TZA_TMP, aes(x = YEAR, y = ANNUAL_TEMP, group = 1)) +
  geom_line(aes(colour = STN_NAME)) +
  geom_point(aes(colour = STN_NAME)) +
  scale_colour_discrete(name  = "Station Name") +
  ylab("Mean Annual Temperature (˚C)") +
  facet_grid(STN_NAME ~ .)

# PRECIPITATION
# Use bar charts
# scale_colour/scale_fill both specified so that there aren't funny lines around bars

# Note: mean(PRCP) is the mean daily precipitation within each year, not the annual total
TZA_PRCP <-
  TZA %>%
  group_by(STN_NAME, YEAR) %>%
  summarise(ANNUAL_PRCP = mean(PRCP, na.rm = TRUE)) # na.rm skips missing daily values

ggplot(data = TZA_PRCP, aes(x = YEAR, y = ANNUAL_PRCP)) +
  geom_bar(aes(colour = STN_NAME, fill = STN_NAME), stat = "identity") +
  scale_colour_discrete(name  = "Station Name") +
  scale_fill_discrete(name  = "Station Name") +
  ylab("Mean Daily Precipitation by Year (mm)") +
  facet_grid(STN_NAME ~ .)

[Image: rplot]
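The original question asked for both total and average precipitation. Taking `mean()` of the daily PRCP values gives the average daily amount in each year, while `sum()` gives the annual total. This is a minimal sketch on made-up daily records showing both side by side, with a count of days that actually reported a value, since a total built from an incomplete year will be an underestimate. It assumes dplyr 1.0.0 or later for the `.groups` argument.

```r
library(dplyr)

# Synthetic daily records; all values are made up
daily <- data.frame(
  STN_NAME = rep(c("KIGOMA", "MTWARA"), each = 4),
  YEAR     = rep(c(1957, 1957, 1958, 1958), times = 2),
  PRCP     = c(0, 10, NA, 4, 2, 2, 6, 0)
)

annual_prcp <- daily %>%
  group_by(STN_NAME, YEAR) %>%
  summarise(
    TOTAL_PRCP      = sum(PRCP, na.rm = TRUE),   # annual total (mm)
    MEAN_DAILY_PRCP = mean(PRCP, na.rm = TRUE),  # average per reporting day (mm)
    N_DAYS          = sum(!is.na(PRCP)),         # days with a reported value
    .groups = "drop"
  )

annual_prcp
```

Either column can then be passed to the bar chart code above in place of ANNUAL_PRCP.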

@adamhsparks
Member

adamhsparks commented Dec 2, 2016

Since your error message indicated an issue in the 1957 year files, I just tried downloading only the 1957 Annual file and creating a data frame in R only for Tanzania.

all_tanzania <- get_GSOD(years = 1957, country = "Tanzania")

# trying URL 'ftp://ftp.ncdc.noaa.gov/pub/data/gsod/1957/gsod_1957.tar'
# Content type 'unknown' length 21739520 bytes (20.7 MB) 
# ==================================================
# downloaded 20.7 MB

# Starting data file processing
   #|===================================================================================================| 100%

head(all_tanzania)
#    WBAN        STNID STN_NAME CTRY STATE CALL    LAT    LON ELEV_M ELEV_M_SRTM_90m    BEGIN      END
# 1 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
# 2 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
# 3 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
# 4 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
# 5 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
# 6 99999 637560-99999   MWANZA   TZ  <NA> HTMW -2.444 32.933   1147            1150 19500108 20161129
#   YEARMODA YEAR MONTH DAY YDAY TEMP TEMP_CNT DEWP DEWP_CNT SLP SLP_CNT   STP STP_CNT VISIB VISIB_CNT WDSP
# 1 19570101 1957    01  01    1 22.8        5 16.8        5  NA       0 887.3       5  46.0         5  1.8
# 2 19570102 1957    01  02    2 24.6        5 15.4        5  NA       0 885.2       5  56.0         5  2.6
# 3 19570103 1957    01  03    3 23.1        5 18.2        5  NA       0 884.8       5  41.2         5  1.5
# 4 19570104 1957    01  04    4 24.3        5 16.2        5  NA       0 885.5       5  51.0         5  3.2
# 5 19570105 1957    01  05    5 21.8        5 17.8        5  NA       0 887.0       5  40.9         5  3.3
# 6 19570106 1957    01  06    6 22.7        5 16.6        5  NA       0 887.0       5  38.0         5  4.2
#   WDSP_CNT MXSPD GUST   MAX MAX_FLAG   MIN MIN_FLAG PRCP PRCP_FLAG SNDP I_FOG  I_RAIN_DRIZZLE I_SNOW_ICE
# 1        5   6.2   NA 26.11        * 17.78        *    0         I   NA     0              0          0
# 2        5   6.2   NA 27.78        * 20.00        *    0         I   NA     0              0          0
# 3        5   6.7   NA 25.61        * 20.00        *    0         I   NA     0              0          0
# 4        5   8.2   NA 27.22        * 20.00        *    0         I   NA     0              0          0
# 5        5   8.7   NA 23.89        * 20.00        *    0         I   NA     0              0          0
# 6        5  10.3   NA 25.61        * 19.39        *    0         I   NA     0              0          0
#   I_HAIL I_THUNDER I_TORNADO_FUNNEL  EA  ES   RH
# 1      0         0                0 1.9 2.8 67.9
# 2      0         0                0 1.7 3.1 54.8
# 3      0         0                0 2.1 2.8 75.0
# 4      0         0                0 1.8 3.0 60.0
# 5      0         0                0 2.0 2.6 76.9
# 6      0         0                0 1.9 2.8 67.9

I think the download may have been corrupted. It works fine for me, so retrying is all I can suggest here. Again, I'm sorry, I'm really not sure what the issue is.
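If a download fails intermittently, the retrying can be automated. This is a minimal sketch of a generic retry wrapper in base R. The helper name `retry` and its arguments are made up for illustration and are not part of GSODR.

```r
# Call f() up to `attempts` times, pausing `wait` seconds between failed tries.
# Returns the first successful result, or stops if every attempt fails.
retry <- function(f, attempts = 3, wait = 5) {
  for (i in seq_len(attempts)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < attempts) {
      Sys.sleep(wait)
    }
  }
  stop("All ", attempts, " attempts failed")
}

# Usage with the call from this thread (not run here; needs GSODR and a network connection):
# all_tanzania <- retry(function() get_GSOD(years = 1957, country = "Tanzania"))
```

Downloading one year at a time, as in the 1957 example above, also makes it easier to see which file is causing trouble.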

@lukuxus

lukuxus commented Dec 3, 2016

Dear Adamh,
What would the code be if we wanted to plot an ombrothermic diagram (combining TEMP and PRCP data) like this one? https://www.researchgate.net/profile/Andreas_Matzarakis/publication/233759040/figure/fig1/AS:300019598020608@1448541654256/Fig-1-Ombrothermic-diagram-at-Heraklion-for-the-climate-period-1961-1990.png

@adamhsparks
Member

adamhsparks commented Dec 3, 2016

@lukuxus, that's not possible using ggplot2, nor is it recommended, for many reasons; see Hadley's response here: http://stackoverflow.com/questions/3099219/plot-with-2-y-axes-one-y-axis-on-the-left-and-another-y-axis-on-the-right

If it's what you wish to do, have a look at this tutorial that I wrote. It illustrates how to use base graphics and plot a bar graph with a line graph over the top.

http://www.apsnet.org/edcenter/advanced/topics/EcologyAndEpidemiologyInR/DiseaseProgress/Pages/StingNematode.aspx
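For orientation, here is a minimal base graphics sketch of the bar-plus-line idea. It is not the code from the tutorial linked above. The function name `plot_bar_with_line` and the monthly values are made up, and the 2:1 scaling between the precipitation axis (mm) and the temperature axis (°C) is my understanding of the usual convention for ombrothermic diagrams, so treat it as an assumption.

```r
# Draw precipitation as bars (left axis) with temperature as a line (right axis)
plot_bar_with_line <- function(labels, prcp, temp) {
  old <- par(mar = c(5, 4, 4, 5) + 0.1)  # widen the right margin for the second axis
  on.exit(par(old))

  p_max <- max(prcp, 2 * temp, na.rm = TRUE) * 1.1
  mids <- barplot(prcp, names.arg = labels, ylim = c(0, p_max),
                  col = "lightblue", ylab = "Precipitation (mm)")
  usr <- par("usr")  # plot limits actually used by barplot()

  # Overlay the line on the same x coordinates; y limits are half the bar limits
  par(new = TRUE)
  plot(as.vector(mids), temp, type = "b", pch = 16, col = "red",
       axes = FALSE, xlab = "", ylab = "",
       xlim = usr[1:2], xaxs = "i",
       ylim = usr[3:4] / 2, yaxs = "i")
  axis(side = 4)
  mtext("Temperature (\u00b0C)", side = 4, line = 3)

  invisible(as.vector(mids))
}

# Made-up monthly values, for illustration only
prcp <- c(120, 110, 150, 220, 100, 10, 5, 5, 10, 40, 110, 130)
temp <- c(26, 26, 26, 25, 24, 23, 22, 23, 24, 25, 26, 26)

plot_bar_with_line(month.abb, prcp, temp)
```

The function returns the bar midpoints invisibly, which is handy if more layers need to be added at the same x positions.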

@adamhsparks
Member

Since this has not had any activity in the past week and a half, I'm closing this now. If you have more questions feel free to reopen it.
