Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a bootstrap command to auto-install packages #650

Merged
merged 1 commit into from
May 27, 2020

Conversation

owais
Copy link
Contributor

@owais owais commented May 5, 2020

This commit introduces a new boostrap command that is shipped as part of
the opentelemetry-auto-instrumentation package. The command detects
installed libraries and installs the relevant auto-instrumentation
packages.

@owais owais force-pushed the bootstrap-cmd branch 6 times, most recently from a85749a to 4d20855 Compare May 5, 2020 17:24
@owais owais marked this pull request as ready for review May 5, 2020 17:38
@owais owais requested a review from a team as a code owner May 5, 2020 17:39
@owais owais force-pushed the bootstrap-cmd branch 3 times, most recently from 57b9a53 to 4ecd50e Compare May 5, 2020 18:24
@owais owais force-pushed the bootstrap-cmd branch 5 times, most recently from 0bd266d to 25bbf76 Compare May 5, 2020 20:31
ocelotl
ocelotl previously requested changes May 5, 2020
Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking comment on having hardcoded references to existing instrumentations. Also, please refrain from any single use variable or function, it makes it pretty hard to review code when you have to be jumping around the file.

Comment on lines 30 to 41
instrumentations = {
"dbapi": "opentelemetry-ext-dbapi>=0.6b0",
"flask": "opentelemetry-ext-flask>=0.6b0",
"grpc": "opentelemetry-ext-grpc>=0.6b0",
"requests": "opentelemetry-ext-http-requests>=0.6b0",
"mysql": "opentelemetry-ext-mysql>=0.6b0",
"psycopg2": "opentelemetry-ext-psycopg2>=0.6b0",
"pymongo": "opentelemetry-ext-pymongo>=0.6b0",
"pymysql": "opentelemetry-ext-pymysql",
"redis": "opentelemetry-ext-redis",
"sqlalchemy": "opentelemetry-ext-sqlalchemy",
"wsgi": "opentelemetry-ext-wsgi>=0.6b0",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have hardcoded references to existing instrumentations. This list should be compiled from the available packages that implement the opentelemetry-instrumentor entry point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to get this information at runtime or do you want to make it a build step and just automate generation of this list?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can help. Let me know if it does, please 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it does. This can help me get a list of installed instrumentations at runtime right?

This script is supposed to be used to install the required instrumentation packages. If we were to automate this, I think the way to do it would be to create a dev script that walks over the ext packages, figures out which ones are auto-instrumentation packages, then figures out which libraries they instrument and then generate a .py file for bootstrap.py to import and use. Even if we do that, in my experience working with a similar command for SignalFx's Python instrumentation, we might need to pin specific versions of different instrumentation in different versions of the bootstrap script so it might not always be feasible to auto-generate such a list without adding some "hard-coded" annotations to instrumentations or bootstrap script.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a tricky problem. @owais can you file an issue to see if we can track a more automated way to do this, or just maybe include this in documentation for auto-instrumentation?

There's merit in hardcoding as well: we don't add magic auto-instrumentation until we've tested things sufficiently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

)


def _pip_check():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please refrain from single use functions? It forces the reader to look back in the file while reading this code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.

Comment on lines 25 to 26
_OUTPUT_INSTALL = "install"
_OUTPUT_REQUIREMENTS = "requirements"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move these module-level constants inside run since they are only used there.

Comment on lines 170 to 203
cmd = {
_OUTPUT_INSTALL: _run_install,
_OUTPUT_REQUIREMENTS: _run_requirments,
}[args.output]
cmd(_find_installed_libraries())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cmd = {
_OUTPUT_INSTALL: _run_install,
_OUTPUT_REQUIREMENTS: _run_requirments,
}[args.output]
cmd(_find_installed_libraries())
{
_OUTPUT_INSTALL: _run_install,
_OUTPUT_REQUIREMENTS: _run_requirments,
}[args.output](_find_installed_libraries())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had that but it was just too much going on in a single line :)

return library in sys.modules or pkgutil.find_loader(library) is not None


def _find_installed_libraries():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Single use function here

Comment on lines +121 to +147
pip_check = check_pipe.communicate()[0].decode()
pip_check_lower = pip_check.lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pip_check = check_pipe.communicate()[0].decode()
pip_check_lower = pip_check.lower()
pip_check_lower = check_pipe.communicate()[0].decode().lower()

@owais
Copy link
Contributor Author

owais commented May 5, 2020

Single use functions were used because they are all system calls and can be more easily substituted if needed for example, can be mocked in tests. Making system calls inline makes mocking harder and less clear. I think it is worth it wrapping system calls in their own functions but happy to re-consider.

@ocelotl
Copy link
Contributor

ocelotl commented May 5, 2020

Single use functions were used because they are all system calls and can be more easily substituted if needed for example, can be mocked in tests. Making system calls inline makes mocking harder and less clear. I think it is worth it wrapping system calls in their own functions but happy to re-consider.

Ok, making things easier to mock in tests is actually a good reason for single use functions, let's just try to use them only when strictly necessary 👍

"mysql": "opentelemetry-ext-mysql>=0.6b0",
"psycopg2": "opentelemetry-ext-psycopg2>=0.6b0",
"pymongo": "opentelemetry-ext-pymongo>=0.6b0",
"pymysql": "opentelemetry-ext-pymysql",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why some of them are missing the version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They hadn't been published on pypi yet.

Comment on lines 157 to 196
parser.add_argument(
"-o",
"--output",
choices=[_OUTPUT_INSTALL, _OUTPUT_REQUIREMENTS],
default=_OUTPUT_INSTALL,
help="""
install - uses pip to install the new requirements using to the currently active site-package.
requirements - prints out the new requirements to stdout. Output can be piped and appended to
a requirements.txt file.
""",
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to have the intermediate --output command?
opentelemetry-bootstrap -o install looks weird to me.

What about removing it and having something like the following?

opentelemetry-bootstrap --install
opentelemetry-bootstrap --requirements

Btw, should a call without arguments (opentelemetry-bootstrap) perform the installation? I got surprised by it, I run opentelemetry-bootstrap to see what was it about and suddenly it was performing a lot of installations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed we might have a more actions in future like updating requirements.txt file in place or support for other tools like pipenv, poetry, etc. Would --action/-a be a better name for the flag instead of --output/-o ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced with --action/-a instead and change the default to printing requirements.

Comment on lines 66 to 68
OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency
(e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade.
Using --no-dependencies alone would leave potential for nonfunctional installations.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please mind wrapping these long lines?

logger.info(
"Existing %s installation detected. Uninstalling.", package
)
_pip_uninstall(package)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we should uninstall packages, or at least, if it should be the default behavior without a flag for that. I can imagine an use case where the user has a working setup but he's missing the instrumentation for some library, he tries opentelemetry-bootstrap and boom, all his instrumentation packages are updated, this could break this setup.

I think in this case we should just print a message telling the user that the instrumentation is already installed and not actions will be performed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that this logic is uninstalling the instrumentation even if the version to be installed is the same one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason to uninstall is that pip does not update packages with pip install for some sources github being one. So if a dependency installs directly from github, pip install does not update to newest version. We are not referencing any packages to be installed directly from github right now and not sure if at least the production ever will. We can remove this if everyone thinks we'll always be referencing pypi packages only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

he tries opentelemetry-bootstrap and boom, all his instrumentation packages are updated, this could break this setup.

Well, the intention of the command is to install/update the dependencies. We can make dry-run the default action for the command and print the dependencies, and force users to issue --install flag to actually make changes to site-packages like you suggested in another comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made printing the requirements the default action. LMK what you think about that.

)


def _pip_check():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.

Comment on lines 66 to 67
OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency
(e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to clarify it in a broader way. Do we want instrumentations to have install dependencies on the instrumented libraries?
If a user does pip install opentelemetry-ext-flask, should it update/install the Flask version installed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ocelotl ocelotl self-requested a review May 11, 2020 19:11
@ocelotl ocelotl dismissed their stale review May 11, 2020 19:12

There is a discussion regarding hardcoding, hardcoding may end up being the preferred approach. Dismissing to unblock this while the discussion is ongoing.

# relevant instrumentors and tracers to uninstall and check for conflicts for target libraries
libraries = {
"dbapi": ("opentelemetry-ext-dbapi",),
"flask": ("opentelemetry-ext-flask",),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that opentelemetry-ext-django is merged, can you add django as well?


_ACTION_INSTALL = "install"
_ACTION_REQUIREMENTS = "requirements"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferred to have a description of what this module does here as well as in the README so users will know what is going on and being installed automatically.

@owais owais force-pushed the bootstrap-cmd branch 2 times, most recently from 8c07b89 to d016505 Compare May 18, 2020 14:52
@owais
Copy link
Contributor Author

owais commented May 18, 2020

Thanks @ocelotl.

@open-telemetry/python-approvers PTAL. I've fixed issues raised by most of the comments or replied to them. Working on add docs for the new command now.

@owais
Copy link
Contributor Author

owais commented May 22, 2020

Took care of the remaining comments and updated autoinstrumentation package docs. PTAL.

@owais owais force-pushed the bootstrap-cmd branch 4 times, most recently from d181b78 to ab53d36 Compare May 26, 2020 09:38
Comment on lines 28 to 40
"dbapi": "opentelemetry-ext-dbapi>=0.8.8",
"django": "opentelemetry-ext-django>=0.8.8",
"flask": "opentelemetry-ext-flask>=0.8.8",
"grpc": "opentelemetry-ext-grpc>=0.8.8",
"requests": "opentelemetry-ext-requests>=0.8.8",
"jinja2": "opentelemetry-ext-jinja2>=0.8.8",
"mysql": "opentelemetry-ext-mysql>=0.8.8",
"psycopg2": "opentelemetry-ext-psycopg2>=0.8.8",
"pymongo": "opentelemetry-ext-pymongo>=0.8.8",
"pymysql": "opentelemetry-ext-pymysql>=0.8.8",
"redis": "opentelemetry-ext-redis>=0.8.8",
"sqlalchemy": "opentelemetry-ext-sqlalchemy>=0.8.8",
"wsgi": "opentelemetry-ext-wsgi>=0.8.8",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the version numbers correct here? I would expect them to be 0.8b0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea how they crept in. I think search/replace command messed them up. Fixed now. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still seeing 0.8.8 unless im missing something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. Should be fixed now. The push didn't go through last time.

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just blocking on the question of versions, otherwise this looks great!

@owais owais force-pushed the bootstrap-cmd branch 2 times, most recently from 0397b54 to 1823ff2 Compare May 26, 2020 21:48
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@codeboten
Copy link
Contributor

@owais please update the changelog, then i'll merge it

This commit introduces a new boostrap command that is shipped as part of
the opentelemetry-auto-instrumentation package. The command detects
installed libraries and installs the relevant auto-instrumentation
packages.
@owais
Copy link
Contributor Author

owais commented May 27, 2020

Updated. @codeboten

@codeboten codeboten merged commit 334a534 into open-telemetry:master May 27, 2020
@owais owais deleted the bootstrap-cmd branch May 27, 2020 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs reviewers PRs with this label are ready for review and needs people to review to move forward.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants