Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(#1130): Cleanup rb namespace by refactoring client API #1160

Merged
merged 22 commits into from Mar 7, 2022

Conversation

dcfidalgo
Copy link
Contributor

@dcfidalgo dcfidalgo commented Feb 15, 2022

Closes #1130

This is a proposal for how to clean up the rb namespace when we do import rubrix as rb:

  • made RubrixClient a module, I named it api.py since it's basically the API if you want to interact with the server, but naming suggestions are welcome
  • moved the few functionalities that were in the rubrix.__init__.py to this module
  • rubrix.__init__.py just imports stuff that it wants to expose to the user, so it's the user-facing API

@dcfidalgo dcfidalgo force-pushed the refactor/clean_rb_namespace branch from d712000 to a23f966 Compare March 2, 2022 11:25
@dcfidalgo dcfidalgo marked this pull request as ready for review March 2, 2022 14:13
@dcfidalgo
Copy link
Contributor Author

Ok, this should be ready to review. Apart from the stuff written above, this PR also includes following:

  • I adopted the way the transformers library handles its namespace. Now we use a "lazy module" (rubrix.utils._LazyRubrixModule) that only imports stuff when they are requested. This makes the import rubrix as rb statement really fast and allows us to raise deprecated warnings when importing stuff that we want to remove from the rubrix namespace in the future.

  • Now we also use scm to write the version to a _version.py file when the package is installed. It's the recommended way and removes some overhead.

  • This PR also includes some major cleanup in the tests. I left the RubrixClient class for now for backward compatibility, although we don't use it anymore. We can remove it (+ tests) once we deprecate it for good.

@dcfidalgo dcfidalgo self-assigned this Mar 2, 2022
@dcfidalgo dcfidalgo added this to In progress in Release via automation Mar 2, 2022
@codecov
Copy link

codecov bot commented Mar 2, 2022

Codecov Report

Merging #1160 (c539b2a) into master (5404b9a) will decrease coverage by 0.40%.
The diff coverage is 94.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1160      +/-   ##
==========================================
- Coverage   95.28%   94.88%   -0.41%     
==========================================
  Files         124      127       +3     
  Lines        5130     5391     +261     
==========================================
+ Hits         4888     5115     +227     
- Misses        242      276      +34     
Flag Coverage Δ
pytest 94.88% <94.85%> (-0.41%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/rubrix/__init__.py 70.58% <70.58%> (-15.42%) ⬇️
src/rubrix/client/api.py 94.58% <94.58%> (ø)
src/rubrix/client/rubrix_client.py 80.58% <100.00%> (-13.46%) ⬇️
src/rubrix/labeling/text_classification/rule.py 100.00% <100.00%> (ø)
src/rubrix/metrics/commons.py 100.00% <100.00%> (ø)
src/rubrix/metrics/text_classification/metrics.py 100.00% <100.00%> (ø)
src/rubrix/metrics/token_classification/metrics.py 96.82% <100.00%> (-0.44%) ⬇️
src/rubrix/monitoring/asgi.py 87.65% <100.00%> (ø)
src/rubrix/server/app.py 100.00% <100.00%> (ø)
...ix/server/security/auth_provider/local/settings.py 100.00% <100.00%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5404b9a...c539b2a. Read the comment docs.

@frascuchon
Copy link
Member

Nice approach !!

Just a question thinking about future monitoring workflows.

Imagine the following use case that could be handled by client: "Logging record predictions in different Rubrix servers due to <place here the reason>". How could it be solved with the singleton module approach implemented in this PR ?

Could we define local sessions by combining rb.init and with sections?

import rubrix as rb

rb.log(....) # Publish to default server

with rb.init(my-rubrix-server) as session:
    session.log(....) # publish using contextualized session
# or 
with rb.init(my-rubrix-server):
    rb.log(...)   

@dcfidalgo
Copy link
Contributor Author

yeah, simultaneously logging to different Rubrix servers with the singleton approach is difficult, not sure how to do this ... however, local (sequential) sessions like you described above are definitely possible, one basically would restore the prior client when exiting the context (with statement). Do you want me to include this feature in this PR? Or better a follow-up one?

In a monitoring context, I would try to run the monitoring in different processes, but of course, this assumes that they do not depend on each other.

@frascuchon
Copy link
Member

Got it. The sequential approach can get complicated once we start using asynchronous logging.

To sumarize, it is difficult to solve this type of situation if we are not able to create and isolate different communications with the server. Let's discuss in a call about the possibilities

@dcfidalgo
Copy link
Contributor Author

Ok, I moved the api module functions into an Api class and wrapped a few of them (the ones we expose in the rubrix namespace) in module functions with the same names. The api.ACTIVE_API attribute stores the "active Api class" that is used when calling rb.log, rb.load, etc. But now you can also instantiate your own Api classes, as many as you want:

import rubrix as rb
from rubrix.client.api import Api

rb.log(...)
Api(...).log(...)

Release automation moved this from In progress to Review Mar 6, 2022
Copy link
Member

@frascuchon frascuchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great !

We should think about contextualized api operations for metrics and labeling modules

@dcfidalgo
Copy link
Contributor Author

We should think about contextualized api operations for metrics and labeling modules

Yes, maybe move all the logic to the Api class and leave some module functions that basically point to these Api methods (like we do for the rb namespace).

@dcfidalgo dcfidalgo merged commit fd2186d into master Mar 7, 2022
Release automation moved this from Review to Ready to DEV QA Mar 7, 2022
@dcfidalgo dcfidalgo deleted the refactor/clean_rb_namespace branch March 7, 2022 09:31
@frascuchon frascuchon moved this from Ready to DEV QA to Approved DEV QA in Release Mar 8, 2022
@frascuchon frascuchon moved this from Approved DEV QA to Ready to DEV QA in Release Mar 8, 2022
@frascuchon frascuchon moved this from Ready to DEV QA to Ready to Release QA in Release Mar 25, 2022
@dcfidalgo dcfidalgo moved this from Ready to Release QA to Approved Release QA in Release Mar 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Release
Approved Release QA
Development

Successfully merging this pull request may close these issues.

[Client] Clean up the rubrix namespace
2 participants