Add basic telemetry features #2314
Looks good! I added many comments but most are minor and a few are simply architectural. There's only one related to `model_name_or_path` that might be worth looking into, and some concerns of mine about the generous use of `except Exception: pass`. Nothing blocking though 🙂
Looks good 🙂 As soon as the tests pass it can be merged. Good job! 👍
Looks good!
I'd however get rid of the Thread in `fire_and_forget`, as posthog already takes care of this and creating threads is a rather expensive operation (it can take tens or even hundreds of milliseconds). Note that currently we create and destroy a thread on every call.
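One way to read this suggestion, as a minimal sketch only: `fire_and_forget` calls the wrapped function inline and just guards against exceptions, leaving background delivery to the client library as the comment describes. The body below is hypothetical and not taken from the PR, and `send_event` and `broken_send` are invented stand-ins for a call that enqueues an event. Logging at debug level instead of using a bare `except Exception: pass` keeps failures non-fatal while still leaving a trace.

```python
import logging
from typing import Any, Callable, List

logger = logging.getLogger(__name__)


def fire_and_forget(func: Callable[..., Any], *args: Any, **kwargs: Any) -> None:
    """Call func inline; a telemetry failure must never reach the caller."""
    try:
        func(*args, **kwargs)
    except Exception:
        # Swallow the error, but keep it visible when debug logging is on.
        logger.debug("Telemetry call failed", exc_info=True)


sent: List[str] = []


def send_event(name: str) -> None:
    # Hypothetical stand-in for a client call that only enqueues the event.
    sent.append(name)


def broken_send(name: str) -> None:
    # Hypothetical stand-in for a call that fails, e.g. without network.
    raise RuntimeError("network down")


fire_and_forget(send_event, "tutorial 1 executed")
fire_and_forget(broken_send, "tutorial 1 executed")  # does not raise
```

No thread is created per call in this variant, so the cost of each call is just the function call itself plus whatever the wrapped function does.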
Proposed changes:
The events contain an anonymous user id, which is a randomly generated `uuid`. There is no way to infer your identity from the user id or any other content of the event.
To prevent revealing a user's identity, the following properties will never be used by telemetry:
You can have a look at the metadata that is shared about your setup by calling `print_telemetry_report()` from `haystack/telemetry.py`. If you would like to inspect all information that telemetry shares when you use Haystack, you can enable writing all events to a log file by calling `enable_writing_events_to_file()`. The default location of that file is `~/.haystack/telemetry.log`.
Here is an example event that is sent when tutorial 1 is executed by running `Tutorial1_Basic_QA_Pipeline.py`.
Users can opt out of telemetry via the following options.
- Call `disable_telemetry()` within Python, or set the environment variable `HAYSTACK_TELEMETRY_ENABLED` to `"False"` directly.
- bash: add the following line to `~/.bashrc` to permanently disable telemetry: `export HAYSTACK_TELEMETRY_ENABLED=False`.
- zsh: add the same line to `~/.zshrc`.
- Windows Command Prompt: `setx HAYSTACK_TELEMETRY_ENABLED "False"`.
- PowerShell: `[Environment]::SetEnvironmentVariable("HAYSTACK_TELEMETRY_ENABLED","False","User")`.
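To illustrate how the environment variable could be honored, here is a minimal sketch. It is not the PR's implementation: the helper name `is_telemetry_enabled`, the case-insensitive comparison, and the default of "enabled when unset" are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "HAYSTACK_TELEMETRY_ENABLED"


def is_telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return False only if the variable is set to "False" (any casing).

    Assumption for this sketch: telemetry is on unless explicitly disabled.
    """
    env = os.environ if env is None else env
    value = env.get(ENV_VAR, "True")
    return value.strip().lower() != "false"


# The shell commands above all result in a value of "False".
opted_out = is_telemetry_enabled({ENV_VAR: "False"})
default = is_telemetry_enabled({})
```

Passing the environment as a mapping keeps the check easy to test without mutating `os.environ`.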
To Do
Make sure the CI does not generate events. All events sent by CI can now be filtered out via `execution_env != "ci"`.
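The filter above can be sketched on the analysis side as follows. The event shape is an assumption: each event is treated as a flat dict with an `execution_env` key, and the event names and the `"notebook"` value are invented for illustration. Only the `execution_env != "ci"` condition comes from the description.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def drop_ci_events(events: List[Event]) -> List[Event]:
    """Keep only events whose execution_env is not "ci"."""
    return [e for e in events if e.get("execution_env") != "ci"]


# Hypothetical events; the real payload layout may differ.
events = [
    {"event": "tutorial 1 executed", "execution_env": "ci"},
    {"event": "tutorial 1 executed", "execution_env": "notebook"},
    {"event": "tutorial 1 executed"},  # no execution_env recorded
]

kept = drop_ci_events(events)
```

Events without an `execution_env` key are kept in this sketch, since a missing value is not equal to `"ci"`.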
.Status (please check what you already did):