
robust=True estimation doesn't work on Ubuntu? #1253

Open
konradsemsch opened this issue Apr 12, 2021 · 4 comments

@konradsemsch

Hi!

I've been using the LogLogisticAFTFitter with robust=True estimation on my Mac, and during the procedure I could see the following message coming from the logger, which I attribute to this setting (at least I presume so, as I haven't seen it before without it):

2021-04-12 09:13:35,766 | INFO | utils.py | _init_num_threads | NumExpr defaulting to 8 threads.

Also, the output of print_summary() clearly shows that the SEs now take different values, as expected.

However, after packaging this code into a Docker image (Ubuntu) and running the container, I couldn't see the same logger output, and the SEs looked as if robust estimation had not taken place.

Question: could it be that this setting silently fails on a different OS? Could you provide a bit more insight into this?
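
For reference, here's roughly how I'm fitting the model (an illustrative sketch only; load_rossi just stands in for my actual dataframe and columns):

from lifelines import LogLogisticAFTFitter
from lifelines.datasets import load_rossi

# load_rossi() is only a placeholder for my real training data
df = load_rossi()

aft = LogLogisticAFTFitter()
# robust=True requests the robust (sandwich) variance estimate
aft.fit(df, duration_col="week", event_col="arrest", robust=True)
aft.print_summary()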

@CamDavidsonPilon
Owner

Hi @konradsemsch - hm, I don't know what's going on here. My guess is that the difference between Ubuntu and Mac is whether numexpr is installed (lots of libs, like pandas, have an optional dependency on numexpr and will use it if it's available in the environment). Can you try pip installing numexpr in the Docker image and rerunning? That would give me some hint as to what might be going on.
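
Something like the following (plain pandas, nothing lifelines-specific) should confirm what the container actually sees:

import pandas as pd

# is numexpr importable in this environment?
try:
    import numexpr
    print("numexpr version:", numexpr.__version__)
except ImportError:
    print("numexpr is not installed")

# is pandas configured to use it? (only takes effect when numexpr is installed)
print("pandas compute.use_numexpr:", pd.get_option("compute.use_numexpr"))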

@konradsemsch
Author

Hi @CamDavidsonPilon! Ok, so I can come back with some further info. Adding numexpr to requirements indeed resolved the issue, but only partially.

  1. Now, during training in the container, I see the following lines:

[screenshot: numexpr log lines during training in the container]

  2. When I run the model.print_summary() function locally on my Mac, I also get a similar note, and the SEs seem reasonable:

[screenshot: print_summary() output on my Mac]

  3. On the other hand, the same print_summary() call executed in the container (to render the online documentation) doesn't produce the same results, as if the numexpr package weren't used, even though the same container with the same dependencies is applied.

Please note that all comparisons were done on exactly the same trained model object; only the OS was different.
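
To compare the two runs numerically rather than by eyeballing print_summary(), I'm also planning to dump the standard errors directly (aft being the fitted model; this relies on the standard_errors_ attribute that fitted lifelines regression models expose):

# write the SEs to disk so the Mac and container outputs can be diffed directly
aft.standard_errors_.to_csv("standard_errors.csv")
print(aft.standard_errors_)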

@CamDavidsonPilon
Owner

Are you able to share the two print_summary results here?

@konradsemsch
Author

Hi @CamDavidsonPilon! Yes, let me report back on this:

So from the logs I can confirm that numexpr is activated in the backend when the model is trained in the container. When I download the model and inspect the results with print_summary() locally, I see the following:

[screenshot, 2021-04-16 12:34:42: print_summary() output inspected locally]

So from here you can already see that numexpr kicked in when the function was called.

And here's part of the summary executed in the container by Sphinx, on the very same model object. Notice that the robust variance doesn't even show up in the top-level summary, even though it's exactly the same object in the same Jupyter notebook:

[screenshot, 2021-04-16 12:31:17: print_summary() output rendered by Sphinx in the container]

I would also expect to see the same numexpr message when calling this function, but it doesn't show up. However, what's surprising is that I can see it being used when invoking some other function on the model object:

[screenshot: numexpr log message appearing when another function is invoked on the model object]

Any clue what could be causing this? Would there be any way to enforce a consistent behaviour?
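
One workaround I'm considering, assuming the discrepancy really comes from pandas taking the numexpr code path in one environment but not the other, is to disable it explicitly in both places so they at least behave the same:

import pandas as pd

# force the plain-numpy code path everywhere, regardless of whether numexpr is installed
pd.set_option("compute.use_numexpr", False)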
