Prevent logging at the kedro project level #737

Closed
roumail opened this issue Mar 25, 2021 · 13 comments

@roumail

roumail commented Mar 25, 2021

I'm seeing a number of warnings like the one below, but I can't figure out how to stop kedro from logging them to the console.

/envs/kedro_test/lib/python3.7/site-packages/sklearn/linear_model/_least_angle.py:34: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.

I thought the snippet below might work if I added it to cli.py, but it didn't do the job. What am I missing here?

import warnings
# Added because the numpy and sklearn versions clash for np.bool type of checks, see below
warnings.filterwarnings("ignore", category=DeprecationWarning,
                        # module="numpy|sklearn|mlflow",
                        message="is a deprecated alias for the builtin")

Controlling warnings for a specific function or module is pretty straightforward. However, these deprecation warnings are completely taking over the log file, so we can't see anything else.

Do you have suggestions for where to place these warning filters?
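
One detail that may be relevant to the filter above: the `message` argument of `warnings.filterwarnings` is a regular expression that has to match from the start of the warning text, so a fragment taken from the middle of the message does not match unless it is prefixed with `.*`. The sketch below uses only the standard library; the helper name `count_shown` and the shortened warning text are illustrative, and it does not claim this was the only reason the filter had no effect in the project.

```python
import warnings

# Shortened copy of the NumPy warning text quoted above.
MESSAGE = "`np.float` is a deprecated alias for the builtin `float`."


def count_shown(pattern: str) -> int:
    """Return how many warnings get past an 'ignore' filter built from `pattern`."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Inserted at the front of the filter list, so it is checked first.
        warnings.filterwarnings("ignore", message=pattern, category=DeprecationWarning)
        warnings.warn(MESSAGE, DeprecationWarning)
    return len(caught)


# The pattern is matched from the beginning of the message.
unanchored = count_shown("is a deprecated alias for the builtin")
anchored = count_shown(".*is a deprecated alias for the builtin")
print(unanchored, anchored)
```

With the unanchored pattern the warning is still shown; with the leading `.*` it is ignored.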

@roumail roumail added the Issue: Bug Report 🐞 Bug that needs to be fixed label Mar 25, 2021
@roumail
Author

roumail commented Mar 25, 2021

Sorry, the label was added automatically; this is less a bug report than a request for information. I wasn't sure where else to ask, and I've already been through the logging documentation, which is not very informative on this point:
https://kedro.readthedocs.io/en/stable/08_logging/01_logging.html#logging

PS: is this something that hooks might help with?

@roumail
Author

roumail commented Mar 26, 2021

Using the following transformer Hook, I'm able to prevent the spurious messages from polluting the logs:

import warnings

from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog


class TransformerHooks:
    @hook_impl
    def after_catalog_created(self, catalog: DataCatalog) -> None:
        """Silence DeprecationWarning once the data catalog has been created."""
        # Added because the numpy and sklearn versions clash for np.bool type of checks
        warnings.simplefilter("ignore", category=DeprecationWarning)

However, my log is still polluted by these messages before the catalog is created. I'm at a loss as to how to control the output that is emitted before that point. Any pointers would be appreciated.

/root/user_files/envs/kedro_test/lib/python3.7/site-packages/sklearn/decomposition/_lda.py:28: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  EPS = np.finfo(np.float).eps
/root/user_files/envs/kedro_test/lib/python3.7/site-packages/sklearn/ensemble/_gb.py:33: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  from ._gradient_boosting import predict_stages
/root/user_files/envs/kedro_test/lib/python3.7/site-packages/mlflow/types/schema.py:49: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  binary = (7, np.dtype("bytes"), "BinaryType", np.object)
.....
100 more lines like that
....

Finally, spark initialization.. 
...
2021-03-26 05:21:41,045 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/usr/local/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-752b1b76-5c98-4701-a3f4-4d85629a90b0;1.0
        confs: [default]
        found org.apache.hadoop#hadoop-aws;3.1.1 in central
        found com.amazonaws#aws-java-sdk-bundle;1.11.271 in central
:: resolution report :: resolve 134ms :: artifacts dl 3ms
        :: modules in use:
        com.amazonaws#aws-java-sdk-bundle;1.11.271 from central in [default]
        org.apache.hadoop#hadoop-aws;3.1.1 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   2   |   0   |   0   |   0   ||   2   |   0   |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-752b1b76-5c98-4701-a3f4-4d85629a90b0
        confs: [default]
        0 artifacts copied, 2 already retrieved (0kB/5ms)
2021-03-26 05:21:43,144 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
...
Node logging begins
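
The source lines echoed in the log above (for example `EPS = np.finfo(np.float).eps`) are at module level, so these warnings are raised while the libraries are being imported. A filter installed in a hook that runs after those imports therefore comes too late for them. The sketch below illustrates the ordering with the standard library only; `noisy_lib` is a hypothetical stand-in module written to a temporary directory, not a real package, and `import_noisy_module` is an illustrative helper.

```python
import importlib
import sys
import tempfile
import warnings
from pathlib import Path

# Hypothetical stand-in for a library that warns at import time.
MODULE_SOURCE = (
    "import warnings\n"
    'warnings.warn("`np.float` is a deprecated alias", DeprecationWarning)\n'
)


def import_noisy_module(filter_first: bool) -> int:
    """Import the stand-in module and return the number of warnings shown."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "noisy_lib.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        sys.modules.pop("noisy_lib", None)
        importlib.invalidate_caches()
        try:
            with warnings.catch_warnings(record=True) as caught:
                warnings.simplefilter("always")
                if filter_first:
                    # Installed before the import: the warning is suppressed.
                    warnings.simplefilter("ignore", category=DeprecationWarning)
                importlib.import_module("noisy_lib")
                # Installed after the import: too late for the warning above.
                warnings.simplefilter("ignore", category=DeprecationWarning)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("noisy_lib", None)
    return len(caught)


print(import_noisy_module(filter_first=False), import_noisy_module(filter_first=True))
```

In a real project the equivalent of `filter_first=True` is to install the filter in whichever file is executed before the noisy libraries are imported.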

@roumail
Author

roumail commented Apr 7, 2021

Hello again. I saw that 0.17.3 adds the before_command_run hook. Do I understand correctly that this hook should let me apply my current workaround to suppress all the superfluous logging?

https://github.com/quantumblacklabs/kedro/blob/develop/RELEASE.md#upcoming-release-0173

@lorenabalan
Contributor

Hi @roumail, that's a good point: the new hook might be of help here. The new version should be out today.

I thought the snippet below might work if I added it to cli.py, but it didn't do the job.

Could you clarify where you added it? I'd expect that if it's in the run() command, before the KedroSession is created, it would still have worked.

Alternatively, you might look at configuring logging.captureWarnings and adding a separate handler for those warnings, but that might be a bit overkill.
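
A minimal sketch of the `logging.captureWarnings` approach mentioned above, using only the standard library. Captured warnings are sent to the logger named `py.warnings`, which can be given its own handler. The in-memory buffer and the helper name `capture_one_warning` are illustrative assumptions; a project would more likely attach a file handler.

```python
import io
import logging
import warnings


def capture_one_warning() -> str:
    """Route one warning through the 'py.warnings' logger and return the logged text."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    warnings_logger = logging.getLogger("py.warnings")
    warnings_logger.addHandler(handler)
    # Keep captured warnings away from the root logger's handlers.
    warnings_logger.propagate = False

    logging.captureWarnings(True)
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("always")
            warnings.warn("`np.bool` is a deprecated alias", DeprecationWarning)
    finally:
        # Restore the previous behaviour so the sketch has no lasting side effects.
        logging.captureWarnings(False)
        warnings_logger.removeHandler(handler)
    return buffer.getvalue()


print(capture_one_warning())
```

The warnings are not dropped here; they are redirected, so they can be written to a separate destination and kept out of the main log.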

@lorenabalan lorenabalan added Issue: Question and removed Issue: Bug Report 🐞 Bug that needs to be fixed labels Apr 21, 2021
@roumail
Author

roumail commented Apr 22, 2021

Hi @lorenabalan ,

Good to hear that the new version will be out today! The snippet I used to silence some of the warnings is given below. Basically, I added a TransformerHook in src/{{ project_name }}/hooks.py.

import logging
import warnings

from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog


class TransformerHooks:
    @property
    def _logger(self):
        return logging.getLogger(self.__class__.__name__)

    @hook_impl
    def after_catalog_created(self, catalog: DataCatalog) -> None:
        # Added because the numpy and sklearn versions clash for np.bool type of checks
        warnings.simplefilter("ignore", category=DeprecationWarning)

@lorenabalan
Contributor

I meant: where in cli.py did you try it when it didn't work? I was just curious. Anyway, it sounds like this is resolved, so is it okay if we close this issue?

@roumail
Author

roumail commented Apr 22, 2021

Oh, sorry for the misunderstanding. We added it to the project's cli.py file, in the click function run, which is also where the kedro session is created.

def run(
    tag,
    env,
    parallel,
    runner,
    is_async,
    node_names,
    to_nodes,
    from_nodes,
    from_inputs,
    load_version,
    pipeline,
    config,
    params,
):

If I understand you correctly, perhaps we should have adapted the run.py file instead?

Yes, feel free to close the issue, since the latest version now has a convenient way to do this.

@lorenabalan
Contributor

Ah, that seemed right to me; that's where I would've put it as well. Anyway, glad the new version makes it even easier to enable this behaviour. I'll close this issue as resolved.

@marcosfelt

I seem to be having this same problem now with versions 0.18 and above, and the above solutions do not work. Any ideas on changes that need to be made?

@matheus695p

@lorenabalan any ideas here? In versions later than 0.18 it is not possible to disable external warnings as proposed above.

@antonymilne
Contributor

@marcosfelt @rodra-go @JAAdrian @klaerik @luizvbo #2184 contains the most up to date information on this now. Please do take a look and see if the solution there helps and if not we'll continue troubleshooting there.

@Kurdzik

Kurdzik commented Apr 24, 2023

Hi, in my case (I'm using kedro 0.18.7), what worked was putting

import warnings
warnings.simplefilter("ignore", category=DeprecationWarning)

in src/settings.py to prevent logging of unnecessary warnings.

@noklam
Contributor

noklam commented Apr 24, 2023

@Kurdzik This is always an option, but if you need finer control you should use logging.yml.

This will work better after we merge a series of PRs, including #2535.
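
As a rough illustration of the finer control a logging configuration can give, the sketch below builds a `dictConfig`-style dictionary in Python and routes the `py.warnings` logger to a handler that discards everything. It assumes that the project's logging.yml follows the standard `logging.config.dictConfig` schema and that warnings are being captured into logging via `logging.captureWarnings(True)`. The handler name `discard` and the helper names are illustrative, and this is not a confirmed kedro recipe.

```python
import logging
import logging.config
import warnings

# Assumed equivalent of a logging.yml fragment, expressed as a Python dict.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {"discard": {"class": "logging.NullHandler"}},
    "loggers": {
        # Captured warnings arrive on this logger; stop them from propagating.
        "py.warnings": {"handlers": ["discard"], "propagate": False},
    },
}


class ListHandler(logging.Handler):
    """Test probe that collects every record reaching the logger it is attached to."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def warnings_reaching_root() -> int:
    """Return how many captured warnings still reach the root logger."""
    logging.config.dictConfig(LOGGING_CONFIG)
    logging.captureWarnings(True)
    probe = ListHandler()
    logging.getLogger().addHandler(probe)
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("always")
            warnings.warn("`np.object` is a deprecated alias", DeprecationWarning)
    finally:
        logging.getLogger().removeHandler(probe)
        logging.captureWarnings(False)
    return len(probe.records)


print(warnings_reaching_root())
```

Replacing the `discard` handler with a file handler would keep the warnings in a separate file instead of dropping them.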
