Add MetNet-3 #54

Open
jacobbieker opened this issue Jun 14, 2023 · 9 comments

Comments

@jacobbieker
Member

Arxiv/Blog/Paper Link

https://arxiv.org/pdf/2306.06079v2.pdf

Detailed Description

MetNet-3 was released as a paper

Context

It would be better to have all versions of MetNet in this repo. This densified forecast could also be useful for the irradiance modelling, going from PV sites to a dense forecast.

@jacobbieker jacobbieker added the enhancement New feature or request label Jun 14, 2023
@jacobbieker jacobbieker self-assigned this Jun 14, 2023
@jacobbieker
Member Author

They use both dense and sparse inputs, and produce both dense and sparse outputs.

  • Uses HRRR output to help with training, but the resulting HRRR predictions aren't actually looked at
  • Uses a very large center context of 2500 km and an extra-large context area of 5000 km, and forecasts out to 24 hours
  • Masks out 25% of sites per example, to help with densification
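The site masking in the last bullet can be sketched as below. This is only an illustration of the idea: the function name, the site identifiers, and the choice to keep masked sites as training targets are assumptions, not details confirmed by the paper notes above.

```python
import random


def split_sites_for_densification(site_ids, mask_fraction=0.25, rng=None):
    """Hide a random fraction of sites from the model input.

    Returns (visible, masked). The masked sites are assumed to stay
    available as targets, so the model has to predict at locations
    for which it received no observations.
    """
    rng = rng or random.Random()
    n_masked = round(len(site_ids) * mask_fraction)
    masked = set(rng.sample(list(site_ids), n_masked))
    visible = [s for s in site_ids if s not in masked]
    return visible, sorted(masked)


# Hypothetical site identifiers, for illustration only.
site_ids = [f"site_{i:03d}" for i in range(40)]
visible, masked = split_sites_for_densification(site_ids, rng=random.Random(0))
print(len(visible), len(masked))  # 30 visible, 10 masked
```

A fresh mask would be drawn per training example, so every site is sometimes seen as an input and sometimes only as a target.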

@jacobbieker
Member Author

More notes:

  • They use 942 weather stations, and train by assigning each weather observation to the 4 km x 4 km pixel in which its station lies. If there are multiple weather stations in a single pixel, they average the values.
  • For evaluation, they don't give the model any past data from the eval weather stations, so each one is the same as any other grid point to compare against. They then also give the eval weather station history as input to get hyperlocal forecasts, which are a decent improvement (could be very useful for site-level forecasts).
  • Inputs include a topographical embedding instead of directly giving a topo map or land/sea mask: a grid with a 4 km stride and 20 parameters per grid point: "For each input example, we calculate the topographical embedding of each input pixel center by bilinearly interpolating the embeddings from the grid. The embedding parameters are trained together with other model parameters similarly to embeddings used in NLP."
    [image]
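A minimal sketch of the station-to-pixel assignment from the first bullet, assuming station positions are already given in kilometres in some projected coordinate system (that input format, and the function name, are assumptions for illustration):

```python
from collections import defaultdict

PIXEL_KM = 4.0  # grid resolution mentioned in the notes above


def grid_station_values(observations, pixel_km=PIXEL_KM):
    """Assign each observation to the pixel containing its station.

    observations: iterable of (x_km, y_km, value) tuples.
    Returns {(row, col): mean value}; pixels without a station are absent,
    which keeps the targets sparse.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x_km, y_km, value in observations:
        pixel = (int(y_km // pixel_km), int(x_km // pixel_km))
        sums[pixel] += value
        counts[pixel] += 1
    # Average when several stations fall into the same pixel.
    return {pixel: sums[pixel] / counts[pixel] for pixel in sums}


# Two stations share pixel (0, 0); a third sits alone in pixel (0, 2).
obs = [(1.0, 1.0, 10.0), (3.5, 2.0, 14.0), (9.0, 1.0, 7.0)]
grid = grid_station_values(obs)
print(grid)  # {(0, 0): 12.0, (0, 2): 7.0}
```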

Architecture:
[image: architecture diagram]

"Data is then processed by a U-Net backbone, which starts with applying two convolutional ResNet blocks [9] and downsampling the data to 8 km resolution. We then pad the internal representation spatially with zeros to 4992 km by 4992 km square and concatenate with the low-resolution, large-context inputs. Afterward, we again apply two convolutional ResNet blocks and downsample the representation to 16 km resolution. Convolutional ResNet blocks can only handle local interactions and for longer lead times close to 24 hours, the targets may depend on the entire input. In order to facilitate that, we process the data at 16 km resolution using a modified version of MaxVit [22] network. MaxVit is a version of Vision Transformer (ViT, [6]) with attention over local neighbourhood as well as global gridded attention. We modify the MaxVit architecture by removing all MLP sub-blocks, adding skip connections (to the MaxVit output) after each MaxVit sub-block, and using normalized keys and queries in attention [5]. Afterwards, we take the central crop of size 768 km by 768 km, and gradually upsample the representation to 4 km resolution using skip connections from the downsampling path, at which point we again take a central crop, this time of size 512 km by 512 km. The network outputs a categorical distribution over 256 bins for each of 6 ground weather variables and a deterministic prediction for each of 617 assimilated weather state channels using an MLP with one hidden layer applied to the representation at 4 km resolution. For precipitation (both instantaneous rate and hourly accumulation), we upsample the representation to 1 km resolution and output for each pixel a categorical distribution over 512 bins."
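To make the spatial bookkeeping in the quoted passage easier to follow, here is a small worked calculation of the feature-map side lengths in pixels. It uses only the extents and resolutions stated in the quote; the helper names and stage labels are made up for illustration.

```python
def pixels(extent_km, resolution_km):
    """Side length in pixels of a square of extent_km at resolution_km."""
    if extent_km % resolution_km:
        raise ValueError("extent must be a multiple of the resolution")
    return extent_km // resolution_km


def central_crop_bounds(size_px, crop_px):
    """Start and end index of a centered crop along one axis."""
    start = (size_px - crop_px) // 2
    return start, start + crop_px


# (stage label, extent in km, resolution in km), taken from the quoted text.
stages = [
    ("padded large context at 8 km", 4992, 8),
    ("MaxViT stage at 16 km", 4992, 16),
    ("central crop at 16 km", 768, 16),
    ("upsampled crop at 4 km", 768, 4),
    ("output crop at 4 km", 512, 4),
    ("precipitation head at 1 km", 512, 1),
]
sizes = {name: pixels(extent, res) for name, extent, res in stages}
for name, side in sizes.items():
    print(f"{name}: {side} x {side}")

# Index ranges of the two central crops along each spatial axis.
print(central_crop_bounds(312, 48))   # (132, 180)
print(central_crop_bounds(192, 128))  # (32, 160)
```

So the MaxViT blocks operate on a 312 x 312 grid, while the final ground-variable outputs are 128 x 128 and the precipitation outputs are 512 x 512.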

  • Lead time is included by applying a time embedding both additively and multiplicatively to blocks, same as MetNet-2
  • Lead times aren't sampled uniformly during training: sampling follows an exponential drop-off, with t=0 having 10 times the probability of being shown vs t=24h
  • Trained with cross-entropy loss, after rescaling the losses to be of similar magnitude. MSE is used for the forecast of the HRRR assimilation state, although those predictions weren't looked at; they just helped training
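One way to read the lead-time sampling bullet is as a weight of the form w(t) = 10^(-t/24), which gives t=0 exactly ten times the weight of t=24h. The sketch below assumes that functional form and hourly lead-time steps; both are assumptions, since the notes only state the 10x ratio and the exponential shape.

```python
import random


def lead_time_weights(lead_times_h, max_lead_h=24.0, ratio=10.0):
    """Exponentially decaying sampling weights.

    w(t) = ratio ** (-t / max_lead_h), so w(0) = 1 and w(max_lead_h) = 1 / ratio.
    """
    return [ratio ** (-t / max_lead_h) for t in lead_times_h]


lead_times_h = list(range(0, 25))  # hourly steps are an assumption
weights = lead_time_weights(lead_times_h)

# Draw a few training lead times; short lead times come up more often.
rng = random.Random(0)
sample = rng.choices(lead_times_h, weights=weights, k=5)
print(sample)
```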

Author Notes:

  • There is a tradeoff in performance between the precipitation forecast and the ground variables: improving one decreased performance for the other
  • To work with this, they trained primarily a precipitation model, then "afterwards we increase the weight of the OMO loss by 100x compared to the precipitation model and finetune the model. Moreover, we disable topographical embedding (fix them to zeros) for this OMO-specific model because topographical embedding may hinder transfer between different locations, which is crucial for learning only from targets present at a sparse set of locations."
  • Loss scaling
    [image]
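A rough sketch of how the 100x OMO re-weighting and the sparse station targets could fit together. The base weights of 1.0 and the loss values are placeholders, not numbers from the paper, and all names here are hypothetical.

```python
def masked_mean(per_pixel_loss, target_mask):
    """Average the loss over pixels that actually contain a station target."""
    kept = [loss for loss, has_target in zip(per_pixel_loss, target_mask) if has_target]
    return sum(kept) / len(kept) if kept else 0.0


def total_loss(precip_loss, omo_loss, hrrr_mse, weights):
    """Weighted sum of the three loss terms mentioned in the notes."""
    return (
        weights["precip"] * precip_loss
        + weights["omo"] * omo_loss
        + weights["hrrr"] * hrrr_mse
    )


# Placeholder weights: the actual loss scaling is not reproduced here.
base_weights = {"precip": 1.0, "omo": 1.0, "hrrr": 1.0}
# Fine-tuning stage: OMO weight raised 100x relative to the precipitation model.
finetune_weights = dict(base_weights, omo=base_weights["omo"] * 100)

# Only pixels 0 and 2 hold a station, so only they contribute to the OMO loss.
omo = masked_mean([0.2, 0.9, 0.4, 0.7], [True, False, True, False])
print(total_loss(0.5, omo, 0.1, base_weights))
print(total_loss(0.5, omo, 0.1, finetune_weights))
```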

@jacobbieker
Member Author

ASOS 1 minute weather data (public and freely accessible): https://madis.ncep.noaa.gov/madis_OMO.shtml

@jacobbieker
Member Author

Also, they mention that MetNet-3 is being used for operational forecasts in Google Search already

@JackKelly
Member

JackKelly commented Jun 15, 2023

Sounds great! Well done for spotting this publication!

MetNet-3 uses a modified MaxViT model in the centre of the U-Net. Here's the MaxViT paper. The MaxViT authors have also released TensorFlow code. But, TBH, MaxViT sounds so simple that it's probably easier to re-implement MaxViT in PyTorch directly from the MaxViT paper 🙂

@jacobbieker
Member Author

Yeah, timm also has a PyTorch implementation of MaxViT; we could either use it or base ours off of it

@jacobbieker
Member Author

Found a website with easily downloadable weather station data for the whole world, including the UK and other countries: https://mesonet.agron.iastate.edu/request/download.phtml?network=GB__ASOS (example download script: https://github.com/akrherz/iem/blob/main/scripts/asos/iem_scraper_example.py)
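For reference, a request URL for that service might be built roughly like this. The endpoint path and query parameter names are assumptions modeled on the linked example script and should be checked against it before use; the station ID `EGLL` is only an illustrative value. Nothing is downloaded here, the code just assembles the URL.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint, modeled on the linked IEM example script.
BASE_URL = "https://mesonet.agron.iastate.edu/cgi-bin/request/asos.py"


def build_asos_request_url(station, start, end):
    """Build a download URL for one station and a date range (assumed parameters)."""
    params = {
        "station": station,
        "data": "all",
        "tz": "Etc/UTC",
        "format": "comma",
        "latlon": "yes",
        "year1": start.year,
        "month1": start.month,
        "day1": start.day,
        "year2": end.year,
        "month2": end.month,
        "day2": end.day,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_asos_request_url("EGLL", date(2023, 6, 1), date(2023, 6, 2))
print(url)
```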

@Raahul-Singh
Collaborator

Found an implementation of MetNet-3: https://github.com/lucidrains/metnet3-pytorch

@meteoDaniel

Here is another implementation, already finished:
https://github.com/kyegomez/metnet3
