Simplify logs management #1696

ZanSara · 2021-11-04T16:56:53Z

Related to #1683 and #1538

With the Pipeline's debug functionality, the behavior of the root logger was modified in a way that affected some Haystack submodules' loggers. This PR should restore their original behavior.

Note: as you can see, the PR introduces a small block of code in each of the submodules. The code is identical, so a small function could be made out of it. However, I think hiding this configuration in a function might make it less obvious how the loggers work for that specific submodule, which might cause subtle problems later. Opinions welcome 🙂

UPDATE: following a deeper look at the current status, we decided to simplify the whole logger's situation along the lines of #1714, and I'm updating this PR in that direction.

Checklist:

Drop the log capture feature from the Pipeline's debug function
Remove all handlers from all loggers
Set the root logger to WARNING to keep most of the underlying libraries quiet (Elasticsearch, FAISS, etc)
Set the haystack logger to INFO to see enough logs during the execution of tutorials and scripts
Remove custom log levels from any Haystack module
Normalize our usage of log levels across different modules, pushing some logs back to DEBUG and a few others up to INFO
Format all logs coming from haystack in a standard way
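
For illustration, the checklist above could be sketched with the standard `logging` module roughly as follows. This is a minimal sketch and not the code merged in this PR: the helper name `reset_and_configure_logging` is hypothetical, and iterating over `loggerDict` relies on a CPython implementation detail of the logging manager.

```python
import logging

LOG_FORMAT = "%(levelname)s - %(name)s -  %(message)s"


def reset_and_configure_logging() -> None:
    """Hypothetical helper (not part of Haystack) that applies the checklist steps."""
    root = logging.getLogger()
    # loggerDict lists every named logger created so far (CPython implementation detail).
    named = [logging.getLogger(name) for name in list(logging.Logger.manager.loggerDict)]

    # Remove all handlers from all loggers, the root logger included.
    for lg in [root, *named]:
        for handler in list(lg.handlers):
            lg.removeHandler(handler)

    # Attach a single handler to the root so every record is formatted the same way.
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    root.addHandler(handler)

    # Root on WARNING keeps third-party libraries quiet; "haystack" on INFO stays informative.
    root.setLevel(logging.WARNING)
    logging.getLogger("haystack").setLevel(logging.INFO)


reset_and_configure_logging()
```

Because child loggers inherit the level of their closest configured ancestor, no per-module setup is needed after this: any `haystack.*` logger resolves to INFO and everything else resolves to WARNING.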

ZanSara · 2021-11-08T17:29:42Z

I dug a bit more into the loggers' situation. I was hoping to find a simple solution, but right now what I proposed seems to be the simplest possible.

I have a couple of ideas for improving the current status:

The log capture feature of the pipeline's debug mode was probably overkill: we might consider removing it along with its related magic (even though I really enjoyed implementing it 😅)
We could review the log levels as such:
- Leave the root logger on WARNING, to avoid too much noise from transformers, faiss or elasticsearch
- Set the main haystack logger to INFO so that we can understand what's going on
- Remove all special setups from the loggers of the submodules
- Review the logs coming from the modeling part. A good amount of those should be just moved back to DEBUG.

With both the above points, the entire logs configuration code could be summarized in two lines in haystack/__init__.py:

logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", datefmt="%m/%d/%Y %H:%M:%S", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)

which I actually like a lot more than what I've created so far 😅
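
To check what those two lines would let through, here is a small sketch that collects records in memory instead of printing them. The `ListHandler` class and the example messages are hypothetical; `force=True` (Python 3.8+) is used because `logging.basicConfig` otherwise does nothing when the root logger already has handlers, which is common in notebooks.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that stores formatted records in a list."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(f"{record.levelname} - {record.name} - {record.getMessage()}")


collector = ListHandler()

# Same levels as the two-line setup above, with the collector as the only root handler.
logging.basicConfig(handlers=[collector], level=logging.WARNING, force=True)
logging.getLogger("haystack").setLevel(logging.INFO)

# Haystack loggers inherit INFO from "haystack"; other libraries inherit WARNING from root.
logging.getLogger("haystack.document_stores").info("indexing documents")
logging.getLogger("haystack.modeling").debug("tokenizer details")
logging.getLogger("elasticsearch").info("request sent")
logging.getLogger("elasticsearch").warning("connection retry")

for line in collector.messages:
    print(line)
```

Only the Haystack INFO record and the Elasticsearch WARNING record should reach the handler. As a side note, the `datefmt` argument in the snippet above only takes effect if the format string includes `%(asctime)s`.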

ZanSara · 2021-11-09T15:56:21Z

Note for the review: I suggest cloning the branch and running a couple of tutorials to check whether the output is too verbose (it is definitely way more verbose than it was before).

bogdankostic

Looking good to me! I ran a few tutorials and noticed that when we load a Reader model, we print device information twice, so I fixed that.

tholor

I tested with Tutorial 1 and the logs are back to the old, verbose state. Probably even a bit more verbose now.

As discussed, let's clean up a bit and adjust the log level of a few messages. Made some suggestions here. It should be quick to do this first iteration based on the output of Tutorial 1 now, and then we can iterate on smaller log messages that catch our attention in the next weeks.

@brandenchan I was wondering if we need to update any information in the documentation about activating debug in pipe.run() as we dropped the debug_logs param here and are not returning the logs anymore. However, I couldn't find it in the docs anywhere. My best guess was this section but this is rather about adding debug information for custom nodes. Do we have this already documented anywhere else?

ZanSara · 2021-11-10T17:09:54Z

I've been running a few more tutorials and noticed some other logs that might change level:

WARNING - haystack.document_stores.base - Duplicate Documents: Document with id '7ed12f389f7f085bb30c7d00abd26f81' already exists in index 'document' -> Goes to INFO?
WARNING - haystack.modeling.data_handler.processor - Currently no support in Processor for returning problematic ids. This is thrown unconditionally in haystack/haystack/modeling/data_handler/processor.py (line 1730 in 14515a8):

logger.warning(f"Currently no support in Processor for returning problematic ids")

-> Goes to INFO?

tholor · 2021-11-11T08:13:12Z

Yeah, the first one can go to INFO. The second one even to DEBUG, I'd say
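
Messages demoted to DEBUG in this way stay reachable for anyone who needs them. A minimal sketch, assuming the default levels discussed above (root on WARNING, `haystack` on INFO) and using the logger names that appear in the log lines quoted earlier:

```python
import logging

# Defaults discussed in this PR: root on WARNING, "haystack" on INFO.
logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)

processor_logger = logging.getLogger("haystack.modeling.data_handler.processor")

# With the defaults, DEBUG records from this module are dropped.
before = processor_logger.isEnabledFor(logging.DEBUG)

# Opt in to DEBUG for this one module only.
processor_logger.setLevel(logging.DEBUG)
after = processor_logger.isEnabledFor(logging.DEBUG)

# Other Haystack modules keep inheriting INFO from the "haystack" logger.
sibling = logging.getLogger("haystack.document_stores.base").isEnabledFor(logging.DEBUG)

print(before, after, sibling)
```

Setting the level on a single module logger leaves the rest of the hierarchy untouched, so the extra verbosity stays local to the module being investigated.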

Move each haystack module's logger configuration into the respective …

247d975

…file and configure the handlers properly

ZanSara requested review from tholor and bogdankostic November 4, 2021 16:59

ZanSara mentioned this pull request Nov 8, 2021

Device and GPU settings no longer printed #1538

Closed

ZanSara changed the title ~~Fix loggers for selected Haystack modules~~ Simplify logs management Nov 9, 2021

ZanSara added 2 commits November 9, 2021 15:23

Implement most changes from #1714

0e98a38

Solve merge conflict with master

22425fe

ZanSara marked this pull request as draft November 9, 2021 14:32

ZanSara and others added 7 commits November 9, 2021 15:33

Remove accidentally committed git merge tags ':D

b10e3b2

Remove the debug logs capture feature

fc76a86

Add latest docstring and tutorial changes

e095b24

Remove more references to debug_logs

19af632

Fix issue with FARMReader that somehow made it to master

7e09886

Merge branch 'single_module_logger_levels' of github.com:deepset-ai/h…

9ee8cc4

…aystack into single_module_logger_levels

Add latest docstring and tutorial changes

41334b1

ZanSara marked this pull request as ready for review November 9, 2021 15:56

bogdankostic added 2 commits November 10, 2021 12:11

Add devices parameter to Inferencer

1190c1c

Fix mypy

e5322c7

bogdankostic approved these changes Nov 10, 2021

tholor requested changes Nov 10, 2021

ZanSara added 3 commits November 10, 2021 14:29

Change log of APEX message to DEBUG and lower the 'Starting <docstore…

3e4a53d

…>...' messages to DEBUG as well

Change log level of a few logs from modeling

9cc5234

Silence the transformers warning

91885c6

Remove empty line below the workers :)

a6ec3d9

bogdankostic mentioned this pull request Nov 10, 2021

Tutorial 5 fails on reader.eval() call #1732

Closed

Fix two more levels in the tutorials logs

86f410a

tholor approved these changes Nov 11, 2021

ZanSara linked an issue Nov 11, 2021 that may be closed by this pull request

Suppress useless warning from transformers about not loaded model weights #1597

Closed

tholor added the type:refactor label (Not necessarily visible to the users) Nov 11, 2021

ZanSara merged commit 42c8edc into master Nov 11, 2021

ZanSara deleted the single_module_logger_levels branch November 11, 2021 09:16

ZanSara mentioned this pull request Nov 11, 2021

Simplify logs management #1714

Closed
