Self Hosting Questions #11
Hi, thanks for checking out this project, and I appreciate your detailed questions here! The "open" aspect of this project is really important to me, so I'm happy to see people digging into the source, but I know this side of things could be much (much!) clearer. I'll try to address things point by point here.
With respect to the read script, you're right that it's not currently in this repository. There are two issues with it: it's an uncommented mess of me learning Python on the fly while building this, and it relies on a ton of assumptions with respect to Lambda and AWS gateway. I think an easier solution is to ask what your ultimate goal is here and go at this from that direction, since with these scripts, all the data will be there. Something along the lines of this notebook is what I have in mind, since it shows a Python script to extract a data point time series from the NetCDF file.
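For reference, the core of that notebook's approach boils down to a nearest-index lookup; a minimal sketch in plain numpy, assuming the processed data is a `(time, lat, lon)` cube with 1-D coordinate arrays (all names here are illustrative, not the repo's actual API):

```python
import numpy as np

def point_time_series(data, lats, lons, lat, lon):
    """Return the time series at the grid point nearest (lat, lon).

    `data` is a (time, lat, lon) array; `lats`/`lons` are the 1-D
    coordinate arrays. Hypothetical names, not this repo's code.
    """
    i = int(np.abs(lats - lat).argmin())  # nearest latitude index
    j = int(np.abs(lons - lon).argmin())  # nearest longitude index
    return data[:, i, j]
```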
Thanks for the response! I will revisit your answers when I get time on the
weekend, but wanted to respond to your question about the processing /
query scripts.
I think what I am looking for is just a function that takes in a lat, lon,
and returns the json structure with all the info filled out.
I am not sure if it is easy to have this repo and what you use share code,
but this could separate the platform-specific code from the query code.
Ideally, I want to try to create a small server that just calls this
function so I can run things all on my local network, or do further
processing.
This is, of course, where I was interested in contributing back.
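To make the ask concrete, here's a rough sketch of the shape such a function could take, assuming the processed grids are already in memory as numpy arrays (every name here is hypothetical, not code from this repo):

```python
import numpy as np

def point_summary(lats, lons, fields, lat, lon):
    """Return a JSON-ready dict of every field's value at the grid
    point nearest (lat, lon).

    `fields` maps variable names to 2-D grids; `lats`/`lons` are the
    1-D coordinate arrays. Hypothetical sketch, not the repo's API.
    """
    i = int(np.abs(lats - lat).argmin())
    j = int(np.abs(lons - lon).argmin())
    return {
        "latitude": float(lats[i]),
        "longitude": float(lons[j]),
        "currently": {name: float(grid[i, j]) for name, grid in fields.items()},
    }
```

A small local server would then just serialize the result with the standard-library `json.dumps` on each request.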
On Tue, Jan 3, 2023 at 9:35 AM Alexander Rey wrote:
Hi,
Thanks for checking out this project, and I appreciate your detailed
questions here! The "open" aspect of this project is really important to
me, so I'm happy to see people digging into the source, but I know this
side of things could be much (much!) clearer. I'll try to address things
point by point here.
1. The "time" function is designed to pass the current time as a string,
using the format "%Y-%m-%dT%H:%M:%S%z". This is how AWS reports when the
function is run, and the processing script then counts back the number of
hours in that table to find the file.
2. Nope! I call this as 4 separate step functions, just changing the
run command (like you've done!).
3. The docker image is designed to run one script at a time, but no
reason you couldn't have multiple copies of the same image running.
4. The efs1z folder is just my internal AWS structure showing through, so
you could store it anywhere. If you're curious, the name comes from storage
on the EFS file system (which is an incredibly flexible tool to get data to
Lambda), set to use 1 zone.
5. I sort of covered this in the first question, but to clarify, you
want to run it on the "Ingest Times (UTC)" row and pass the current time to
the container. So to run HRRR-Hourly (hrrrh), you'd set Cron to
2:30,8:30,14:30,20:30 and pass (using 2:30 as an example)
"2023-01-01T02:30:00+0000".
With respect to the read script, you're right that it's not currently in
this repository. There are two issues with it: it's an uncommented mess of
me learning Python on the fly while building this, and it relies on a ton of
assumptions with respect to Lambda and AWS gateway. I think an easier
solution is to ask what your ultimate goal is here and go at this from that
direction, since with these scripts, all the data will be there. Something
along the lines of this notebook:
https://github.com/alexander0042/Pirate-Weather-SMSL/blob/main/Pirate_HRRR_SM_Notebook.ipynb
is what I have in mind, since it shows a Python script to extract a data
point time series from the NetCDF file.
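Tying points 1 and 5 above together, the timestamp strings the container expects for each cron slot can be generated with the standard library alone; a minimal sketch (the function name is mine, not the repo's):

```python
from datetime import datetime, timezone

def hrrrh_ingest_stamps(year, month, day):
    """Return the four hrrrh ingest-time strings for one UTC day,
    matching the 2:30/8:30/14:30/20:30 cron schedule described above.
    Illustrative helper, not part of the actual pipeline."""
    return [
        datetime(year, month, day, hour, 30, tzinfo=timezone.utc)
        .strftime("%Y-%m-%dT%H:%M:%S%z")
        for hour in (2, 8, 14, 20)
    ]
```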
+1, the project is interesting and I will give this a spin and integrate it into an open-source project with the API written in Python, as a ready-to-go solution with my roundbox project. It's a work in progress too, but the idea is to integrate anything as fast as possible and have it ready to deploy.
There has been no activity on this issue for ninety days, and unless you comment on the issue it will automatically close in seven days.
Please leave this open, as the API-to-raw-data scripts are still not included in the open-sourced code.
Happy to leave this open for now, and it is still on the roadmap; however, the issue remains that everything is still very tightly integrated with AWS/Lambda at the moment, so it isn't usable outside of my specific environment. In order to speed up response times, I'm eventually migrating this to Docker, so it's very doable down the line! I'll also caution that the processing scripts download ~100 GB/day, so self hosting will require a pretty beefy internet connection.
Self hosting on AWS is always an option 😉
That would be exactly my thinking. I dislike subscriptions because my internet isn't the most stable; I would much rather donate some money and then run the server on my own hardware, so that if I wish to make 50,000 API requests per month, I could. I personally want it for Home Assistant.
Hi, is there any detailed guide on how to self-host this and run it on your own machine?
Posting this here since I think it fits with this discussion, but I'm looking into what license I should use for the open-source stuff. Currently, everything is licensed under Apache 2.0; however, since the V2.0 code is pretty well all new, there's an option to take another look at this. My goal here is to make it possible to self host and run the entire stack (which will require a pretty beefy computing setup, but is within the realm of possibility), but I also want to avoid what happened to Redis, and have some provider come along and replicate it all without contributing back to keep improving this project.

Along these lines, I'm debating releasing V2 under AGPL, and I'm curious what people think about this. I know it's a pretty restrictive license; however, the current status quo is not having the source public at all, which certainly isn't ideal either! The flip side is that I think I'll have to add a contributor license agreement to make commercial use of the project possible with permission. Again, definitely not ideal, but in order for the free version of this to keep running, the AWS bill has to be paid somehow, so this seems like the way. I'm envisioning a MinIO sort of structure: not ideal, but a practical way to make this open while keeping the lights on for the project.
Personally, I'm a member of SlimeVR. We dual-license under Apache and MIT. I dislike the GPL license for how "poisonous" it is. However, it's better than nothing, and I will support the project either way.
Many thanks for the project and open sourcing your processing scripts.
I tried to dabble a bit myself before finding this project and only got as far as extracting and looking at the data in the GRIB files.
I was able to run the Docker image with your included scripts to download the data from S3 onto my machine.
I have a couple of questions.

The scripts write into an `efs1z` folder. Is this a specific folder name, or does it have some meaning here? Should the other scripts be in a different folder?

The one piece of code I am trying to find is whatever maps the lat/lon to query these files. The docs say that a Lambda function does this and are relatively detailed on the process. Is this code public? If so, could you point me in the right direction? Many thanks!
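While the actual Lambda code isn't public, the generic version of that lat/lon-to-grid mapping is straightforward. For a curvilinear grid like HRRR's (where latitude and longitude are 2-D arrays), a nearest-neighbour index does the job; a sketch using scipy, with illustrative names, treating degrees as Euclidean distance (a reasonable approximation for nearest-cell lookup on a fine mesh):

```python
import numpy as np
from scipy.spatial import cKDTree

def build_grid_index(lat2d, lon2d):
    """Build a nearest-neighbour index over a curvilinear grid.
    Treats (lat, lon) degrees as Euclidean coordinates; a generic
    sketch, not this project's actual Lambda code."""
    points = np.column_stack([lat2d.ravel(), lon2d.ravel()])
    return cKDTree(points), lat2d.shape

def nearest_ij(tree, shape, lat, lon):
    """Map a query (lat, lon) to the (i, j) index of the nearest cell."""
    _, flat = tree.query([lat, lon])
    return np.unravel_index(flat, shape)
```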