Unable to generate notebook using format JUPYTER: {"error_code": "MAX_NOTEBOOK_SIZE_EXCEEDED"} #112

Open
robsanpam opened this issue Jan 11, 2023 · 1 comment

Comments

@robsanpam

Hello,

While running an AutoML experiment on Databricks Runtime 11.3 ML, I get the following error:

Unable to generate notebook at [workspace location] using format JUPYTER: {"error_code": "MAX_NOTEBOOK_SIZE_EXCEEDED", "message": "File size imported is 34974148 bytes), exceeded max size (10485760 bytes)"}
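
For scale, the limit quoted in the message is exactly 10 MiB (10 × 1024 × 1024 bytes), and the reported file is a bit over three times that. Below is a minimal Python sketch of the arithmetic, using only the two numbers from the error above. The helper name `size_report` is hypothetical and is not part of any Databricks API.

```python
MAX_NOTEBOOK_BYTES = 10_485_760  # limit quoted in the error message
REPORTED_BYTES = 34_974_148      # file size quoted in the error message


def size_report(size_bytes: int, limit_bytes: int = MAX_NOTEBOOK_BYTES) -> dict:
    """Summarize how far a file is over (or under) the import limit."""
    mib = 1024 * 1024
    return {
        "size_mib": round(size_bytes / mib, 2),
        "limit_mib": round(limit_bytes / mib, 2),
        "ratio": round(size_bytes / limit_bytes, 2),
        "exceeds": size_bytes > limit_bytes,
    }


print(size_report(REPORTED_BYTES))
```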

The exact same code runs smoothly in other Databricks environments, even for datasets with more variables and more training instances. In one particular environment, however, this error always comes up.

The learning task is a regression. I have tried reducing the number of training instances from 20M (which I know are automatically sampled during the initial AutoML steps) to 2K, but it still generates a Jupyter notebook of 12 MB, apparently larger than the allowed maximum.

My first guess was that the pandas profiling step causes the error while rendering the output of a "big" dataset, but I did manage to manually run the exact same pandas profiling notebook using the same training set DataFrame passed to the AutoML task.
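
One way to test the profiling hypothesis is to export a generated notebook from an environment where the run succeeds and measure how many bytes each cell's outputs contribute. The sketch below assumes the standard `.ipynb` JSON layout (a top-level `cells` list in which code cells carry an `outputs` list). The helper name `cell_output_sizes` and the in-memory example are hypothetical, and this has not been run against an AutoML-generated notebook.

```python
import json


def cell_output_sizes(notebook: dict) -> list:
    """Return (cell_index, output_bytes) pairs, largest first."""
    sizes = []
    for index, cell in enumerate(notebook.get("cells", [])):
        # Markdown cells have no "outputs" key, so default to an empty list.
        outputs = cell.get("outputs", [])
        # Serialize the outputs to approximate their share of the file size.
        size = len(json.dumps(outputs).encode("utf-8"))
        sizes.append((index, size))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


# Tiny in-memory stand-in for a notebook (illustrative data only).
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {
            "cell_type": "code",
            "source": ["print('x' * 1000)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["x" * 1000]}
            ],
        },
        {"cell_type": "code", "source": ["1"], "outputs": []},
    ]
}

for index, size in cell_output_sizes(example)[:5]:
    print(f"cell {index}: {size} bytes of output")
```

For a notebook on disk, load it with `json.load` first and pass the resulting dict to the helper. If one or two cells account for most of the bytes, that points to rendered output rather than the number of training rows.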

Any help is appreciated, because I'm not sure what else to do: the error occurs in a phase of the process that I haven't accessed or modified.


@robsanpam
Author

It seems there is no problem when the Databricks server has only a few experiments, but as that number starts to increase, this error shows up.

I just got the same error on another Databricks server that had no problems before, by running the exact code that worked yesterday without a problem.

Does anyone know if there is a limit on the number of experiments or runs that can be logged or created?
