Add normals data #38

Closed
steffilazerte opened this issue Oct 10, 2017 · 6 comments

@steffilazerte
Member

steffilazerte commented Oct 10, 2017

  • Add climate normals database
  • Add ability to update climate normals data frame
  • Add function to extract climate normals for a given station? (would pretty much just be a dplyr filter wrapper)
  • ftp://client_climate@ftp.tor.ec.gc.ca/Pub/Normals/English/
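
The "filter wrapper" idea in the third bullet might look roughly like the sketch below. It is written in Python purely for illustration (weathercan itself is an R package, where this would be a `dplyr::filter()` call). The function name `normals_for_station`, the field names, and the sample values are all hypothetical; none of them come from weathercan or from real ECCC data.

```python
# Hypothetical bundled normals table: one record per station and month.
# Field names and values are made up for illustration only.
NORMALS = [
    {"station_id": 1001, "month": 1, "temp_mean": -10.0},
    {"station_id": 1001, "month": 7, "temp_mean": 19.0},
    {"station_id": 2002, "month": 1, "temp_mean": 2.5},
]


def normals_for_station(station_id, normals=NORMALS):
    """Return the normals records for one station (a plain filter)."""
    rows = [row for row in normals if row["station_id"] == station_id]
    if not rows:
        # Fail loudly rather than silently returning an empty result.
        raise KeyError(f"no normals bundled for station {station_id}")
    return rows
```

Raising on an unknown station is a design choice in this sketch; returning an empty result would be an equally reasonable alternative.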

OR

Implement a normals calculation

  • Definitely would be faster and would require less bundled data
  • Possibly more prone to errors?
@boshek
Collaborator

boshek commented Oct 10, 2017

I tend to think that using the ECCC values makes the most sense, because decisions may have been made in summarizing the data that we are unaware of. That opens up a situation where we could end up with different values (if I am understanding this correctly). On the other hand, since the climate normals are not current, we may be able to update them. It may also be more to maintain if we create a normals function.

Is there any reason that we wouldn't simply point to the CSV files?

ftp://ftp.tor.ec.gc.ca/Pub/Normals/English/English_CSV_files/

@steffilazerte
Member Author

We totally can. The only drawback is that the files bundle data for multiple stations, so we'd have to download the correct file and then extract the station's data (and the files are generally 5-8 MB). It's not the end of the world, but it would be slow... Alternatively, we could bundle the data in weathercan as an internal dataset, but that would be a large file to have. So as I see it we have three options:

  1. Function to access and extract station normals from ftp site as needed
    Pro: Simple and up-to-date
    Con: Slow

  2. Function to access and extract station normals from locally stored data as needed
    Pro: Relatively simple, fast
    Con: May get out of date, large data set to store

  3. We calculate normals based on ftp://client_climate@ftp.tor.ec.gc.ca/Pub/Normals/English/Calculation_of_the_1981_to_2010_Climate_Normals_for_Canada.doc
    And hope that we can recreate the values
    Pro: Fast, up-to-date
    Con: May not match ECCC values if there's something else going on
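
As a rough illustration of option 3, a 30-year normal for a calendar month can be approximated as the mean of that month's values across 1981 to 2010. The sketch below is Python for illustration only, not weathercan code. It assumes a hypothetical input keyed by `(year, month)` and a made-up completeness threshold `min_years`. ECCC's documented procedure has its own completeness rules, which are not reproduced here, so results from something like this may well differ from the published values.

```python
from collections import defaultdict
from statistics import mean


def monthly_normals(monthly_means, start=1981, end=2010, min_years=25):
    """Average each calendar month over the years start..end (inclusive).

    monthly_means: dict mapping (year, month) -> monthly mean value.
    min_years is an assumed completeness threshold, not ECCC's rule.
    Months with fewer than min_years of usable data map to None.
    """
    by_month = defaultdict(list)
    for (year, month), value in monthly_means.items():
        # Keep only values inside the normals period and skip missing ones.
        if start <= year <= end and value is not None:
            by_month[month].append(value)
    return {
        month: mean(by_month[month]) if len(by_month[month]) >= min_years else None
        for month in range(1, 13)
    }
```

Returning `None` for months without enough data keeps the gaps visible instead of reporting a normal based on only a few years.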

@boshek
Collaborator

boshek commented Oct 10, 2017

Thinking about this a little more, I think option 3 is probably the best. The issue is that it is also the most work. I hadn't totally understood the .csv files correctly, so thanks for explaining that.

@steffilazerte
Member Author

Yeah, it's really too bad about the .csv files, because that would definitely be the best way to go about it. I agree that option 3 is probably the most work :) We'll see how much time I have!

@steffilazerte
Member Author

Actually there's also option 4) Scrape the data from the website: http://climate.weather.gc.ca/climate_normals/results_1981_2010_e.html?searchType=stnName&txtStationName=brandon&searchMethod=contains&txtCentralLatMin=0&txtCentralLatSec=0&txtCentralLongMin=0&txtCentralLongSec=0&stnID=3471&dispBack=0

But this is prone to errors and may be susceptible to small website changes.

steffilazerte added this to To do in weathercan v1.0.0 via automation Jun 10, 2019
steffilazerte moved this from To do to In progress in weathercan v1.0.0 Jun 10, 2019
steffilazerte mentioned this issue Sep 24, 2019
steffilazerte moved this from In progress to Done in weathercan v1.0.0 Sep 29, 2019

2 participants