Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add logging to libcudf #12637

Merged
merged 56 commits into from
Mar 10, 2023
Merged

Add logging to libcudf #12637

merged 56 commits into from
Mar 10, 2023

Conversation

vuule
Copy link
Contributor

@vuule vuule commented Jan 28, 2023

Description

Adds a global spdlog logger.
Logging should be done through CUDF_LOG_* macros declared in detail namespace.
Logging level and output file path can be set via environment variables.
Logging can also be configured directly through the singleton object using spdlog APIs (should only be used outside of libcudf).

Added sample (useful) logging to getenv_or.
Fixed a bug discovered through the log :) Environment variable was being read in each hostdevice_vector constructor call.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@vuule vuule added feature request New feature or request non-breaking Non-breaking change labels Jan 28, 2023
@vuule vuule self-assigned this Jan 28, 2023
@github-actions github-actions bot added conda libcudf Affects libcudf (C++/CUDA) code. labels Jan 28, 2023
@codecov
Copy link

codecov bot commented Jan 28, 2023

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@e4557cb). Click here to learn what that means.
Patch has no changes to coverable lines.

❗ Current head f0ae3e2 differs from pull request most recent head 855fcc0. Consider uploading reports for the commit 855fcc0 to get more accurate results

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04   #12637   +/-   ##
===============================================
  Coverage                ?   43.53%           
===============================================
  Files                   ?      158           
  Lines                   ?    24868           
  Branches                ?        0           
===============================================
  Hits                    ?    10827           
  Misses                  ?    14041           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@github-actions github-actions bot added the CMake CMake build issue label Jan 31, 2023
@bdice
Copy link
Contributor

bdice commented Jan 31, 2023

FYI, we need to fix some issues with the way rmm is packaging/using spdlog. I have a PR here that moves in that direction, but more changes may be needed. Please don’t merge this PR until I can confirm that rmm is fixed up. See: rapidsai/rmm#1177

@jrhemstad
Copy link
Contributor

jrhemstad commented Feb 2, 2023

You may want to make the design a bit more flexible such that it doesn't assume one is always logging to a file.

@jlowe or @revans2 can correct me if I'm wrong, but iirc Spark wants to be able to log to something like cout/cerr. In the RMM logging adaptor we enabled this by adding a constructor that takes an arbitrary std::ostream: https://github.com/rapidsai/rmm/blob/58ed5c9c373e8859bb009b32cd1f7dd0522e1a7d/include/rmm/mr/device/logging_resource_adaptor.hpp#L78-L98

@jlowe
Copy link
Member

jlowe commented Feb 2, 2023

Spark wants to be able to log to something like cout/cerr

We'd like the ability to glue the cudf logging into the existing Spark/Java logging mechanisms, e.g.: log4j. Being able to specify a custom spdlog sink, ideally at runtime, would give us the flexibility to do that and more.

@vuule
Copy link
Contributor Author

vuule commented Feb 2, 2023

The singleton spdlog object is accessible via cudf::logger() (unless I broke something when moving definitions to .cpp). It can be used to fully configure logging, including a custom sink.

@vuule
Copy link
Contributor Author

vuule commented Feb 3, 2023

One caveat: the API that changes the sink is not thread-safe. @jlowe is this an issue?
Here's a sample of how logger can be modified at runtime (will add samples to PR through tests):

std::ostringstream oss;
// Divert the log to the string stream (not thread-safe!)
cudf::logger().sinks() = {std::make_shared<spdlog::sinks::ostream_sink_mt>(oss)};
// Only include the log message (no timestamp, etc.)
cudf::logger().set_formatter(
  std::unique_ptr<spdlog::formatter>(new spdlog::pattern_formatter("%v")));
// Only log warnings and worse
cudf::logger().set_level(spdlog::level::warn);

bdice
bdice previously requested changes Mar 1, 2023
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR appears to rely on rmm's spdlog/fmt transitively. I think we want to update our conda recipes and/or CPM logic to explicitly depend on spdlog/fmt? What should our approach be for downstream packages like cudf @robertmaynard ? I'd think the conda packages should be explicit dependencies but I'm not sure how to handle the CMake/CPM side.

@vuule vuule requested a review from bdice March 6, 2023 21:59
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fmt/spdlog conda packaging looks good. The fmt/spdlog CMake packaging looks good to me also, but I would like a second eye from @robertmaynard before merging.

@bdice bdice dismissed their stale review March 8, 2023 02:17

outdated

@robertmaynard
Copy link
Contributor

The fmt/spdlog conda packaging looks good. The fmt/spdlog CMake packaging looks good to me also, but I would like a second eye from @robertmaynard before merging.

Looks good to me

@harrism
Copy link
Member

harrism commented Mar 8, 2023

We should see if the numbers in rapidsai/rmm#1222 also accurately capture what happens to compile times here.

Has anything happened with this? I'm curious how urgent it is to change RMM's spdlog dependency to use the pre-compiled library.

@vuule vuule requested a review from bdice March 9, 2023 19:49
@vuule
Copy link
Contributor Author

vuule commented Mar 9, 2023

We should see if the numbers in rapidsai/rmm#1222 also accurately capture what happens to compile times here.

Has anything happened with this? I'm curious how urgent it is to change RMM's spdlog dependency to use the pre-compiled library.

I tried using the build metrics to evaluate the impact of spdlog dependency. There's a measurable impact on .so size, but compilation time seems hard to compare because of caching. The time difference does not look alarming given the difference between cache hit rates.
no logging
image
with logging
image

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of minor things but looks basically ready to go.

@@ -676,8 +678,11 @@ if(NOT USE_NVTX)
target_compile_definitions(cudf PUBLIC NVTX_DISABLE)
endif()

# Define RMM logging level
target_compile_definitions(cudf PUBLIC "RMM_LOGGING_LEVEL=LIBCUDF_LOGGING_LEVEL")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you would run into any issues. RMM only sets this for compiled components i.e. its tests and benchmarks. The main rmm code is header-only so there are no compiled components where SPDLOG_ACTIVE_LEVEL would be set to conflict with whatever libcudf sets.

cpp/cmake/thirdparty/get_spdlog.cmake Show resolved Hide resolved
cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md Outdated Show resolved Hide resolved
cpp/doxygen/developer_guide/DEVELOPER_GUIDE.md Outdated Show resolved Hide resolved
cpp/src/io/utilities/hostdevice_vector.hpp Show resolved Hide resolved
vuule and others added 3 commits March 9, 2023 16:29
cpp/CMakeLists.txt Outdated Show resolved Hide resolved
vuule and others added 2 commits March 10, 2023 12:31
Co-authored-by: Vyas Ramasubramani <vyas.ramasubramani@gmail.com>
@vuule vuule requested a review from vyasr March 10, 2023 20:32
@vuule vuule added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Mar 10, 2023
@vuule
Copy link
Contributor Author

vuule commented Mar 10, 2023

/merge

@rapids-bot rapids-bot bot merged commit f216c0b into rapidsai:branch-23.04 Mar 10, 2023
@vuule vuule deleted the fea-logger branch March 13, 2023 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants