Enable feeding the server configuration from environment variables #765
In the new YAML configuration there is no longer a server-specific config file, just a single one. While this mechanism could work for the new YAML config, I am trying to build something more generic using Java reflection, which would not require changing the YAML file and would follow the same format as the current property file (e.g.
OK, I did not manage to find a generic solution. I don't think anything reasonable can work with lists. Would it be better to drop the idea of overriding configuration parameters with environment variables? I think this can only work for trivial configurations.
No problem. For the YAML deserialisation, perhaps using Jackson would solve some of the mapping problems. If it works like the JSON/XML mapper, it should be rather simple to implement, even with complex nested YAML config files...
The problem is mapping environment variables (which are a kind of linear serialization) onto the configuration class/object they should override, which is nested and contains lists.
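To make the difficulty concrete, here is a minimal sketch of a generic override, assuming a hypothetical `GROBID__SECTION__KEY` naming convention (double underscore as path separator, similar in spirit to Airflow's) and a configuration already parsed into nested maps. The class name, the prefix and the keys are all illustrative and not part of this PR. Nested maps are easy to walk, but as soon as the path hits a list there is no obvious way to address an element, so the variable is simply ignored here.

```java
import java.util.Locale;
import java.util.Map;

class EnvOverrideSketch {
    // Hypothetical prefix and separator; GROBID does not define this convention.
    static final String PREFIX = "GROBID__";

    /** Applies prefixed variables onto a nested-map configuration, in place. */
    @SuppressWarnings("unchecked")
    static void applyOverrides(Map<String, Object> config, Map<String, String> env) {
        for (Map.Entry<String, String> entry : env.entrySet()) {
            String name = entry.getKey();
            if (!name.startsWith(PREFIX) || name.length() == PREFIX.length()) {
                continue;
            }
            // GROBID__SERVER__PORT -> ["server", "port"]
            String[] path = name.substring(PREFIX.length())
                    .toLowerCase(Locale.ROOT)
                    .split("__");
            Map<String, Object> node = config;
            boolean resolved = true;
            for (int i = 0; i < path.length - 1; i++) {
                Object child = node.get(path[i]);
                if (!(child instanceof Map)) {
                    // Missing key or a list: the flat name cannot say which element is meant.
                    resolved = false;
                    break;
                }
                node = (Map<String, Object>) child;
            }
            if (resolved) {
                // The value stays a String; the target type is lost as well.
                node.put(path[path.length - 1], entry.getValue());
            }
        }
    }
}
```

Besides lists, the sketch also shows two smaller issues: lower-casing the variable name does not recover camelCase keys, and every overridden value arrives as a string.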
The solution adopted by Dropwizard (we could borrow the idea... and the code) is to reference the variable directly in the configuration, so that the complexity is left out without losing flexibility.
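As a rough illustration of that idea, the sketch below replaces `${NAME}` and `${NAME:-default}` placeholders in the raw configuration text before it is parsed. It is a standalone stand-in written with plain `java.util.regex`, not Dropwizard's code and not the code of this PR; in Dropwizard itself the same job is done, as far as I know, by `EnvironmentVariableSubstitutor` wrapped in a `SubstitutingSourceProvider`. The class name, the `grobidHome` key and the `GROBID_HOME` variable are assumptions chosen for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSketch {
    // Matches ${NAME} or ${NAME:-default}
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\\}");

    /** Replaces placeholders with values from env, falling back to the inline default. */
    static String substitute(String text, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = env.get(m.group(1));
            if (value == null) {
                value = m.group(2); // inline default, null if none was given
            }
            if (value == null) {
                value = m.group(0); // no value and no default: leave the placeholder untouched
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical config line and variable name, for illustration only.
        String yaml = "grobidHome: ${GROBID_HOME:-grobid-home}\n";
        System.out.print(substitute(yaml, System.getenv()));
    }
}
```

Because the substitution works on text, nesting and lists in the YAML no longer matter: only the places where the user wrote a placeholder can be overridden.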
Luca, how do you see the process with the new YAML config? Would it be only user-driven? I don't think we can hack the YAML file for all the possible parameter values, because it now has potentially many more than a hundred possible parameter keys (each model has its own set of nested parameters).
Yes, let's keep it simple. Users can always define their own configuration and decide which variables are read from the environment.
I've updated this PR. Now the users can select which information can be read from environment variables by using
E.g. for the grobid-home:
To keep it simple, we can just let the user decide what to replace (or we can add some variables that could be useful, e.g. for the Docker container).
When I tried to write the documentation for this, I realized that there is no clear way to get it working with Docker, for example:
docker run -t --rm --init -p 8080:8070 -p 8081:8071 \
    --env GROBID_MODEL_PRELOAD=true \
    --env GROBID_GROBID_HOME="/some/where/grobid-home" \
    lfoppiano/grobid:0.7.0
... because the user would need not only their own definitions in the configuration file, but also to rebuild a new image with the modified configuration file.
Yes indeed. My idea is that once you've set up your configuration file with the parameters, you can easily run the Docker container and change the parameters at the command line instead of editing the file.
Yes, for me the main use case would also look like that, mainly with Docker / containers (but not strictly). Things to potentially consider:
(see also #480 - I had my previous implementation based on Airflow's config)
Sorry if I was not clear, the problem I raise is the complexity of the solution:
The consequences are that the solution can never be compatible with the images on Docker Hub, and it requires some heavy preliminary work that in the end makes it look more complicated than just specifying a config file as a parameter at the command line. I initially started to document it in the Docker section, but this might need to go to the "developer notes" because the existing Docker images cannot support this.
I guess you could include a default config file with all of the placeholders in the Docker Hub image? That is why Airflow's config opted for a more general pattern. Perhaps a bit similar (although not environment variables), Helm install supports
We can include some placeholders by default so that certain basic settings can be modified from the Docker image without supplying an additional config file, e.g. GROBID home, number of threads, lazy/eager loading of models, etc. Personally, for more advanced usage, I would modify the config file and add placeholders where I need them, then supply the values to the Docker image, so that I can run a different configuration without modifying the configuration file again.
Following up #762. This PR enables the substitution of certain configuration items from environment variables.
In the future it can be reused as long as the configuration is loaded through Dropwizard.