Question regarding deployment on Heroku #38
Comments
Ah, yes, you need to add the buildpack through the Heroku GUI. Just go to the settings page of your Heroku project, where you should find the add buildpack option. There you can drop in the |
also, you need to run |
(noticed that is actually a typo in the docs, just fixed it) |
Did you manage to get it to work? I opened an issue with dtreeviz here to make the xgboost dependency optional, so that in the future it should be less of a headache. |
Unfortunately, I'm not quite there. Building worked (after adding the Python buildpack as well 👼 ), but the app itself crashes due to
and the procfile |
Ah, you need to add a requirements.txt file with: explainerdashboard==0.2.13.2 |
Ok, I forgot to mention the file, which did not contain gunicorn. Now I am facing an en-/decoding error:
|
Maybe it's because gunicorn is not designed for Windows and I should use waitress instead? |
Ah, then the Python version on Heroku and the one you used to pickle the explainer are probably not compatible (pickle does not guarantee that objects unpickle across versions). You can learn about setting runtime versions here and here. Basically, you should add a runtime.txt file with the Python version:
Supported versions are python-3.9.0, python-3.8.6, python-3.7.9 and python-3.6.12 |
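Since the pickled explainer has to be loaded by a compatible interpreter, one way to keep the two in sync is to derive the runtime.txt entry from the Python that created the pickle. This is only a minimal sketch: the helper name `runtime_txt_line` is made up for this example, and the resulting version still has to be one that Heroku supports.

```python
import sys


def runtime_txt_line(version_info=sys.version_info):
    """Return the runtime.txt entry for the given interpreter version."""
    major, minor, micro = version_info[:3]
    return f"python-{major}.{minor}.{micro}"


if __name__ == "__main__":
    # Run this with the same interpreter that dumped the explainer,
    # then copy the printed line into runtime.txt.
    print(runtime_txt_line())
```

If the printed version is not in the supported list, the safer route is probably to rebuild the explainer locally with a supported version rather than to hope the pickle loads anyway.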
Ah, yes. I don't know much about Windows deployment (didn't even know that Heroku supported it, to be honest :), but waitress is then probably the way to go. If you manage to get it to work, could you let me know so that I can add instructions to the docs? |
I managed to track down every bug/mistake and got it running with gunicorn :-) For future reference, I'm using the following app.py:
Using .flask_server() is somehow not so smart. |
ah, really? That is strange, should be equivalent. Do you have a public link for the dashboard? Would be curious what you built. |
Uh, never mind, I guess I put it into the wrong spot. The following works fine as well.
The documentation looks great. Maybe you want to explain why you are using Sure, I will share it, but I have another problem: the plots are empty 😁 although I have loaded the data in the Github project. What could be the reason for this? Do you even need the data to be present when loading a dashboard from disk? |
Ah yeah: If the plots are empty, something is definitely broken: you would want to check the logs for the stacktrace. All the data should be contained in the |
I have the yaml and the explainer-file in the repo. Locally, loading works fine, but on Heroku, the plots are empty. I can't find anything in the logs:
|
Hmm, that is quite strange. Two potential reasons:
|
Group cats is disabled atm, but choosing/changing an index in FeatureInputComponent does not show/change anything either. |
I tried loading the explainer file only and building the dashboard in app.py: no difference. I also tried constructing the dashboard itself in app.py, which does not work due to memory limitations on Heroku. My next & last idea is to deploy one of your Titanic dashboards ... |
Yeah, probably handy to first test something that you know should work:

generate_dashboard.py:

from explainerdashboard import ClassifierExplainer, ExplainerDashboard
from explainerdashboard.custom import *

explainer = ClassifierExplainer(model, X_test, y_test)

# building an ExplainerDashboard ensures that all necessary properties
# get calculated:
db = ExplainerDashboard(explainer, [ShapDependenceComposite, WhatIfComposite],
                        title='Awesome Dashboard', hide_whatifpdp=True)

# store both the explainer and the dashboard configuration:
explainer.dump("explainer.joblib")
db.to_yaml("dashboard.yaml")

Now run dashboard.py:

from explainerdashboard import ClassifierExplainer, ExplainerDashboard

explainer = ClassifierExplainer.from_file("explainer.joblib")

# you can override params during load from_config:
db = ExplainerDashboard.from_config(explainer, "dashboard.yaml", title="Awesomer Title")
app = db.flask_server()

Now run If it works, push the repo and see if it works on heroku as well. |
Locally it works, but the result is https://hk-db-test.herokuapp.com/. :( |
Hmm, weird. Do you maybe have the dashboard linked to a public github repo so that I can fork it and see if I can get it to work? |
Yup, it's here: https://github.com/hkoppen/Dashboard_Test |
Found the problem (now just need to find the solution): (Btw, if you install the papertrail add-on (it's for free), you can see real time logs including stacktraces like below to help you debug)
So the callbacks are not working for some reason (hence why you don't see the graphs, nor does the random index button work)... I will have to investigate why, but it is very odd that it works for the titanicexplainer.herokuapp.com deployment but not for this one. Hmm. |
Got it: change
Not sure exactly what is so magical about |
I think I have an idea what is going on: https://docs.gunicorn.org/en/stable/settings.html:
In order to make sure that all dash elements are unique, I add a random Will add a clearer warning to the docs that |
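The mismatch can be reproduced without Dash or gunicorn at all. The sketch below is only an illustration under stated assumptions: `build_component_ids` and the element names are invented for this example and are not the library's code. Each simulated worker builds its own ids, so random suffixes differ between workers, while ids that are built once and then shared (which is what `--preload` achieves by loading the application before the workers are forked) are identical everywhere.

```python
import uuid


def build_component_ids(name=None):
    """Build ids for three hypothetical dashboard elements.

    With name=None a random suffix is generated, mimicking a component
    that invents its own unique name when it is constructed.
    """
    suffix = name if name is not None else uuid.uuid4().hex[:5]
    return [f"{element}-{suffix}" for element in ("graph", "index", "button")]


# Without preloading, every worker imports the app and builds its own ids,
# so the random suffixes will almost never agree between workers.
worker_a = build_component_ids()
worker_b = build_component_ids()
print("separate builds match:", worker_a == worker_b)

# With preloading, the ids are built once and every worker inherits them.
preloaded = build_component_ids()
worker_c, worker_d = preloaded, preloaded
print("preloaded builds match:", worker_c == worker_d)

# Fixed names are the same no matter which worker builds them.
print("fixed names match:",
      build_component_ids(name="1") == build_component_ids(name="1"))
```

A browser that loaded its layout from one worker and sends a callback to another only finds the callback if both workers agree on the ids, which is why either preloading or deterministic names avoids the error.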
Yup, that's it. Damn it, we talked about it exactly 7 days ago! Edit: Now I can move on to deploy the app via Docker ;-) |
I'm having the same error about callbacks not found. The --preload option solves it when running from one container with gunicorn, but when scaling with docker swarm the error shows up again. |
Hi, it is still not working. All our dashboards are deployed by default via docker. It seems like it still assigns uuid names to callbacks. Could it have something to do with running the dashboard through gunicorn and wsgi? |
Is this the default dashboard or did you make your own custom dashboards? |
So for example, when I call:
I get the following output, where you can see that all the callback id's end with two digits ('10', '23', etc) instead of a uuid string of length 5. Do you see the same?
|
I am using the default dashboard, and just switch things off and on in the dashboard.yaml file. So when I do
I get the following output, which seems to have the uuid extension. |
Are you sure you're on the latest (pypi) version (0.2.17)? In case you're installing through conda: the conda version is a bit behind (0.2.15) because we're dealing with some conda-forge dependency conflicts, and that version does not yet have the |
I'm quite sure it is version 0.2.17 that we are deploying through docker
but we are also building a venv. So maybe that messes something up? |
Shouldn't mess things up, using virtual envs myself. Only thing I can think of right now is that you have a cached build step from the docker build that is still using 0.2.15. So could try to prune the cache and see if that helps. Will build a dashboard myself inside a docker container, and see if I run into the same issue. Do you have a reproducible example with Dockerfile that generates the error? (can just be with the titanic dataset) |
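To rule out a stale cached layer, it can help to print the version that is actually installed inside the running container. A small sketch using only the standard library (Python 3.8+); `installed_version` is a helper name made up for this example:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run inside the container (or the venv) that serves the dashboard.
    print("explainerdashboard:", installed_version("explainerdashboard"))
```

Running this in both the build stage and the final image should show whether the venv and the system site-packages disagree.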
This seems to work fine, with no

generate_dashboard.py

from sklearn.ensemble import RandomForestClassifier
from explainerdashboard import *
from explainerdashboard.datasets import *

X_train, y_train, X_test, y_test = titanic_survive()
model = RandomForestClassifier(n_estimators=50, max_depth=5)
model.fit(X_train, y_train)
explainer = ClassifierExplainer(model, X_test, y_test,
                                cats=["Sex", 'Deck', 'Embarked'],
                                labels=['Not Survived', 'Survived'],
                                descriptions=feature_descriptions)
db = ExplainerDashboard(explainer)
db.to_yaml("dashboard.yaml", explainerfile="explainer.joblib", dump_explainer=True)

run_dashboard.py

import waitress
from explainerdashboard import *

db = ExplainerDashboard.from_config("dashboard.yaml")
print(list(db.app.callback_map.values()))

if __name__ == "__main__":
    waitress.serve(db.app.server, host='0.0.0.0', port=9050)

Dockerfile

FROM python:3.8
RUN pip install explainerdashboard
COPY generate_dashboard.py ./
COPY run_dashboard.py ./
RUN python generate_dashboard.py
EXPOSE 9050
CMD ["python", "./run_dashboard.py"]

$ docker build -t explainerdashboard .
$ docker run -p 9050:9050 explainerdashboard |
@moeller84 Did you manage to get it to work? |
Sorry. No, I did not, unfortunately. |
When I read the source code it seems like you are still generating uuids when name is None, but then just suffixing a number at the end. EDIT: when I recreate your example from above I don't get the generated uuids. |
Yes, each composite (base for a tab), simply adds a number to the end to

class ImportancesComposite(ExplainerComponent):
    def __init__(self, explainer, title="Feature Importances", name=None,
                 hide_importances=False,
                 hide_selector=True, **kwargs):
        """Overview tab of feature importances

        Can show both permutation importances and mean absolute shap values.

        Args:
            explainer (Explainer): explainer object constructed with either
                ClassifierExplainer() or RegressionExplainer()
            title (str, optional): Title of tab or page. Defaults to
                "Feature Importances".
            name (str, optional): unique name to add to Component elements.
                If None then random uuid is generated to make sure
                it's unique. Defaults to None.
            hide_importances (bool, optional): hide the ImportancesComponent
            hide_selector (bool, optional): hide the post label selector.
                Defaults to True.
        """
        super().__init__(explainer, title, name)
        self.importances = ImportancesComponent(
            explainer, name=self.name+"0", hide_selector=hide_selector, **kwargs)

    def layout(self):
        return html.Div([
            dbc.Row([
                make_hideable(
                    dbc.Col([
                        self.importances.layout(),
                    ]), hide=self.hide_importances),
            ], style=dict(margin=25))
        ])

Then

self.tabs = [instantiate_component(tab, explainer, name=str(i+1), **kwargs) for i, tab in enumerate(tabs)]

So each tab gets the name "1", "2", "3", etc. And then each subcomponent gets the name "11", "12", etc. Are you defining custom components? Or defining them before you add them to E.g.
Would result in a random |
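The numbering scheme described above (tabs named "1", "2", "3" and a digit appended for every subcomponent) can be sketched in a few lines. This is a simplified illustration, not the library's code: `name_components` and the tab labels are hypothetical, subcomponents are counted from 0 as in the `self.name+"0"` line, and the sketch assumes fewer than ten tabs and fewer than ten subcomponents per tab so that the concatenated names stay unique.

```python
def name_components(tabs):
    """Assign deterministic names to tabs and their subcomponents.

    `tabs` maps a tab label to the number of subcomponents it holds.
    Tab i (1-based) is named str(i); its j-th subcomponent (0-based)
    is named str(i) + str(j).
    """
    names = {}
    for i, (label, n_children) in enumerate(tabs.items(), start=1):
        tab_name = str(i)
        names[label] = [tab_name + str(j) for j in range(n_children)]
    return names


if __name__ == "__main__":
    layout = {"importances": 1, "model_summary": 4, "contributions": 3}
    for label, children in name_components(layout).items():
        print(label, children)
```

Because the names depend only on the position in the component tree, every worker or container that builds the same dashboard ends up with the same ids.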
Hi |
So in a swarm it starts generating uuid names but in a single container it doesn't? That seems super strange... Again, the only thing I can think of is old versions of explainerdashboard in a cached docker layer. |
I'm gonna see if I can build some diagnostic functionality that makes it easier to see the whole component tree, including |
It also generates uuid names with a single container. But it seems that callback names get mixed up when running on more than one container |
ah, okay, that at least is an easier to understand problem. So the example I gave you didn't give uuid names right? Is there any code you can share on how you generate the dashboard? Because you have to be doing something custom otherwise it would just work out of the box. |
dashboard.yaml
|
Ah, I think I got it! In the yaml I see:
So that equates to The string tab indicators get converted by

def _convert_str_tabs(self, component):
    if isinstance(component, str):
        if component == 'importances':
            return ImportancesTab
        elif component == 'model_summary':
            return ModelSummaryTab
        elif component == 'contributions':
            return ContributionsTab
        elif component == 'whatif':
            return WhatIfTab
        elif component == 'shap_dependence':
            return ShapDependenceTab
        elif component == 'shap_interaction':
            return ShapInteractionsTab
        elif component == 'decision_trees':
            return DecisionTreesTab
    return component

These

dashboard:
  explainerfile: data/processed/explainer.joblib
  params:
    title: Fastholdelses model
    hide_header: false
    hide_shapsummary: false
    header_hide_title: false
    header_hide_selector: false
    block_selector_callbacks: false
    pos_label: null
    fluid: true
    mode: dash
    width: 1000
    height: 800
    external_stylesheets: null
    # server: true
    # url_base_pathname: null
    responsive: true
    logins: null
    port: 8050
    importances: false
    model_summary: false
    shap_interaction: false
    decision_trees: false

So this is equivalent to passing booleans to switch off tabs: |
Just released https://github.com/oegedijk/explainerdashboard/releases/tag/v0.2.20 which should fix this issue... |
I think you can also simplify the loading of the dashboard:

def create_app(config):
    logger.info("Starting app...")
    logger.debug(f"Using config {config}")
    app = Flask("for_p_afgang_dashboard")
    app.config.from_object(config)

    @app.route("/health")
    def healthcheck():
        return "Healthy", 200

    setup_extensions(app)

    explainerfile = str(file_path.parent.joinpath("data").joinpath("explainer.joblib"))
    dashboard_yaml_path = file_path.parent.joinpath("dashboard.yaml")
    logger.info(explainerfile)
    dashboard = ExplainerDashboard.from_config(
        explainerfile, dashboard_yaml_path, server=app, url_base_pathname="/")
    logger.info(f"Explainer contains {len(dashboard.explainer)} samples")
    print(list(dashboard.app.callback_map.values()))

    @app.route("/")
    def return_dashboard():
        return dashboard.app.index()

    logger.info("Explainer dashboard loaded")
    return app |
I updated to the latest version and also altered the .yaml file. That leaves me with this error (without having touched anything else):
|
Ah, yeah, you have to rebuild the explainer with the new version: I made some breaking changes to how categorical features and one-hot-encoded features are handled internally in order to support categorical features. (On the plus side: categorical features are supported now!) |
Is there a reason why you are using UUIDs in the first place? Thinking you could just set seed and do randomization with numbers to get deterministic names. E.g line 177 in dashboard_methods.py |
Original goal was to generate a unique name that is both short and url-friendly (planning on adding querystring support at some point). But I guess that could be done simpler and without the shortuuid dependency, e.g.: https://proinsias.github.io/til/Python-UUID-generate-random-but-reproducible-with-seed/ Got a code suggestion? |
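One possible shape for such a suggestion, following the seeded-UUID idea from the linked post: feed bits from a seeded `random.Random` into `uuid.UUID` and keep a short hex prefix, which is URL-friendly and needs no extra dependency. This is a sketch only; `seeded_names`, the seed value and the length of 5 are assumptions for illustration, not what the library does.

```python
import random
import uuid


def seeded_names(seed, count, length=5):
    """Generate `count` short, reproducible hex names from a fixed seed."""
    rng = random.Random(seed)
    names = []
    for _ in range(count):
        # Build a version-4 UUID from seeded random bits, so the same seed
        # always yields the same sequence of names.
        value = uuid.UUID(int=rng.getrandbits(128), version=4)
        names.append(value.hex[:length])
    return names


if __name__ == "__main__":
    first = seeded_names(seed=42, count=3)
    second = seeded_names(seed=42, count=3)
    print(first, first == second)
```

The names are only reproducible if every process creates its components in the same order, and a five-character prefix can in principle collide, so a uniqueness check would still be sensible.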
Is it working now? Shall I close the issue? |
This seems to be working now! Ran with several workers on gunicorn and also saved the callback id names, which all match. |
Awesome! |
I just tried to deploy my app on Heroku by directly importing the github project.
However, I did not manage to "add the buildpack" correctly - I'm still generating a slug larger than 500MB. I did
What am I doing wrong, do I have to add the buildpack somewhere in Heroku itself?