
Self Hosting Questions #11

Open
goldbattle opened this issue Jan 1, 2023 · 11 comments

Comments

@goldbattle

Many thanks for the project and open sourcing your processing scripts.
I tried to dabble a bit myself before finding this project and only got as far as extracting and looking at the data in the GRIB files.
I was able to run the Docker image with your included scripts to download the data from S3 onto my machine.

Building

cd pirateweather/wgrib2/
docker build -t wgrib2 -f Dockerfile .
docker image list

Running

docker run --net=host \
    -v "D:\\WEATHER\\:/mnt" \
    -e bucket="noaa-hrrr-bdp-pds" \
    -e download_path="/mnt/data/efs1z" \
    -e temp_path="/mnt/data/tmp" \
    -e time="2022-12-30T15:45:00Z" \
    wgrib2 \
    /mnt/pirateweather/scripts/hrrrh_combined-fargate.py

I have a couple questions.

  1. The timestamp input seems to determine which times are downloaded. How do you normally specify this? As the current time?
  2. Do I need to run all the scripts? Or, since I am in the US, can I just run the HRRR script, which I believe has most of the data? (You note in the docs that it doesn't seem to have UV, but that is fine for me currently.)
  3. If I do need to run the other scripts, can they run concurrently, or only sequentially?
  4. The example run command puts files inside an efs1z folder. Is this a specific folder name, or does it have some meaning here? Should the other scripts write to a different folder?
  5. I want to run the download on a cron job. Could you explain how I should interpret the trigger table? Should I run HRRR every hour to get data, or would every 3 hours suffice?

The one piece of code I am trying to find is whatever maps a lat/lon to queries against these files. The docs say a Lambda function does this and describe the process in some detail. Is that code public? If so, could you point me in the right direction? Many thanks!

@alexander0042
Collaborator

alexander0042 commented Jan 3, 2023

Hi,

Thanks for checking out this project, and I appreciate your detailed questions here! The "open" aspect of this project is really important to me, so I'm happy to see people digging into the source, but I know this side of things could be much (much!) clearer. I'll try to address things point by point here.

  1. The "time" parameter is designed to be the current time as a string, using the format "%Y-%m-%dT%H:%M:%S%z". This is how AWS reports when the function is run, and the processing script then counts back the number of hours given in that table to find the file.
  2. Nope! I call this as 4 separate step functions, just changing the run command (like you've done!).
  3. The docker image is designed to run one script at a time, but no reason you couldn't have multiple copies of the same image running.
  4. The efs1z is just my internal AWS structure coming out, so you could store it anywhere. If you're curious, the name comes from storage on the EFS file system (which is an incredibly flexible tool to get data to Lambda), set to use 1 zone.
  5. I sort of covered this in the first question, but to clarify, you want to run it on the "Ingest Times (UTC)" row and pass the current time to the container. So to run HRRR-Hourly (hrrrh), you'd set Cron to 2:30,8:30,14:30,20:30 and pass (using 2:30 as an example) "2023-01-01T02:30:00+0000".
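To make the trigger concrete, here's a minimal sketch of what points 1 and 5 describe. The helper names and the example offset are my own illustrations, not code from the repo; a crontab entry like `30 2,8,14,20 * * *` would invoke the container with the output of `trigger_timestamp()` as the `time` environment variable.

```python
from datetime import datetime, timedelta, timezone

def trigger_timestamp(now=None):
    """Format the current UTC time the way the scripts expect
    (this is the string a cron job would pass via -e time=...)."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S%z")

def model_run(trigger, hours_back):
    """Parse a trigger timestamp and count back to the model run hour,
    mirroring how the processing script uses the hours in the trigger table."""
    t = datetime.strptime(trigger, "%Y-%m-%dT%H:%M:%S%z")
    return t - timedelta(hours=hours_back)
```

For example, a 02:30 UTC trigger with a two-hour offset resolves to the 00:30 run of the same day.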

With respect to the read script, you're right that it's not currently in this repository. There are two issues with it: it's an uncommented mess from me learning Python on the fly while building this, and it relies on a ton of assumptions about Lambda and AWS API Gateway. I think an easier approach is to ask what your ultimate goal is and work from that direction, since with these scripts all the data will be there. Something along the lines of this notebook is what I have in mind, since it shows a Python script to extract a data point time series from the NetCDF file.
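On the read side, the core of mapping a lat/lon onto these files is a nearest-grid-cell lookup. Here's a minimal NumPy sketch; the function name and the squared-degree distance metric are my own simplifications, not the Lambda's actual code, and a real implementation would account for the grid's projection:

```python
import numpy as np

def nearest_cell(lats, lons, lat, lon):
    """Return the (row, col) index of the grid cell closest to (lat, lon).

    lats/lons are the 2-D coordinate arrays stored alongside the data.
    Squared-degree distance is crude but adequate for a regional grid
    like HRRR's.
    """
    dist = (lats - lat) ** 2 + (lons - lon) ** 2
    return np.unravel_index(np.argmin(dist), dist.shape)
```

With the index in hand, slicing the NetCDF variable along its time axis at that cell (e.g. `var[:, row, col]`) gives the point time series the notebook demonstrates.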

@goldbattle
Author

goldbattle commented Jan 3, 2023 via email

@SoulRaven

+1, the project is interesting. I will take it for a spin and integrate it into an open-source project with the API written in Python, as a ready-to-go solution in my roundbox project. That is also a work in progress, but the idea is to integrate everything as quickly as possible and have it ready to deploy.
Can you share more about what you have written in the backend for the API?

@github-actions

There has been no activity on this issue for ninety days, and unless you comment on it, the issue will automatically close in seven days.

@github-actions github-actions bot added the stale label Jul 12, 2023
@goldbattle
Author

Please leave this open, as the API-to-raw-data scripts are still not included in the open-sourced code.

@github-actions github-actions bot removed the stale label Jul 13, 2023
@alexander0042
Collaborator

Happy to leave this open for now, and it is still on the roadmap; however, the issue remains that everything is very tightly integrated with AWS/Lambda at the moment, so it isn't usable outside of my specific environment. To speed up response times, I'm eventually migrating this to Docker, so it is very doable down the line! I'll also caution that the processing scripts download ~100 GB/day, so self hosting will require a pretty beefy internet connection.

@fox91

fox91 commented Jul 13, 2023

Self hosting on AWS is always an option 😉
Please release it as is; we don't mind if it isn't optimized or can't be run with one click.
Open source doesn't mean "runs easily on your device with your custom config"...

@lordbagel42

That is exactly my thinking. I dislike subscriptions because my internet isn't the most stable; I would much rather donate some money and run the server on my own hardware, so that if I wanted to make 50,000 API requests per month, I could. I personally want it for Home Assistant.

@cloneofghosts cloneofghosts pinned this issue Nov 7, 2023
@cloneofghosts cloneofghosts added this to the Pirate Weather 2.0 milestone Dec 12, 2023
@msft-jeelpatel

Hi, is there any detailed guide on how to self host this and run it on your own machine?

@alexander0042
Collaborator

Posting this here since I think it fits with this discussion, but I'm looking into what license I should use for the open-source stuff. Currently, everything is licensed under Apache 2.0; however, since the V2.0 code is pretty well all new, there's an option to take another look at this. My goal is to make it possible to self host and run the entire stack (which will require a pretty beefy computing setup, but within the realm of possibility), but I also want to avoid what happened to Redis: having some provider come along and replicate it all without contributing back to keep improving this project. Along these lines, I'm debating releasing V2 under the AGPL, and I'm curious what people think about this.

I know it's a pretty restrictive license; however, the current status quo is not having the source public at all, which certainly isn't ideal either! The flip side is that I think I'll have to add a contributor license agreement to make commercial use of the project possible with permission. Again, definitely not ideal, but for the free version of this to keep running, the AWS bill has to be paid somehow, so this seems like the way. I'm envisioning a MinIO sort of structure: not ideal, but a practical way to keep this open while keeping the lights on for the project.

@lordbagel42

Personally, I'm a member of SlimeVR; we dual-license under Apache and MIT.

I dislike the GPL for how "poisonous" it is. However, it's better than nothing, and I will support the project either way.
