Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lite: Fix the analytics module to use asyncio to work in the Wasm env #5045

Merged
merged 14 commits into from Aug 4, 2023

Conversation

whitphx
Copy link
Member

@whitphx whitphx commented Jul 31, 2023

Description

Following the discussion at #4967 (comment)

Technically, the current analytics module doesn't work in the Wasm mode because it relies on the threading module that doesn't work on the Pyodide runtime.
So in this PR they are replaced with asyncio and pyodide.http.pyfetch() for the Wasm env.

🎯 PRs Should Target Issues

Before your create a PR, please check to see if there is an existing issue for this change. If not, please create an issue before you create this PR, unless the fix is very small.

Not adhering to this guideline will result in the PR being closed.

Tests

  1. PRs will only be merged if tests pass on CI. To run the tests locally, please set up your Gradio environment locally and run the tests: bash scripts/run_all_tests.sh

  2. You may need to run the linters: bash scripts/format_backend.sh and bash scripts/format_frontend.sh

@vercel
Copy link

vercel bot commented Jul 31, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
gradio ✅ Ready (Inspect) Visit Preview Aug 4, 2023 1:00pm

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented Jul 31, 2023

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
gradio minor
  • Maintainers can select this checkbox to manually select packages to update.

With the following changelog entry.

Lite: Fix the analytics module to use asyncio to work in the Wasm env

Maintainers or the PR author can modify the PR title to modify this entry.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@gradio-pr-bot
Copy link
Contributor

gradio-pr-bot commented Jul 31, 2023

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-5045-all-demos


You can install the changes in this PR by running:

pip install https://gradio-builds.s3.amazonaws.com/4df38b4e52be5e958834fd8dabbe1569c2c745c6/gradio-3.39.0-py3-none-any.whl

@whitphx whitphx changed the title Fix the analytics module to use asyncio to work in the Wasm env Lite: Fix the analytics module to use asyncio to work in the Wasm env Jul 31, 2023
@@ -0,0 +1,5 @@
---
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete this changeset file

Copy link
Member Author

@whitphx whitphx Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will delete it.

btw, it was added by the CI bot. Did it detect something wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why it added two different changeset files! Will keep an eye out to see if it happens again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I added this one, and the bot added the another one: .changeset/hungry-bobcats-own.md

Should only this file about @gradio/lite be deleted, or another one about the gradio Python package too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh you don't need to add a changeset manually, the bot will take care of it. You can delete the changeset you created.

I don't think we have the automatic changeset support for @gradio/lite at the moment. Would be good to add it closer to the release

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
While it may also work, I have added the changeset for @gradio/lite in each PR, like #4779 (comment), btw.

gradio/blocks.py Outdated Show resolved Hide resolved
@abidlabs
Copy link
Member

Nice @whitphx! Thanks for making this addition. Generally looks good.

I couldn't figure out why the test is failing? It looks like its trying to do analytics using the WASM approach, but this should not be the case since the tests don't run on pyiodide by default.

Btw it would be good to add a similar test to

def test_error_analytics_successful(self, mock_post, monkeypatch):

to ensure that the request is being made in WASM mode (ok to add pyiodide as a test requirement)

data["ip_address"] = get_local_ip_address()
try:
requests.post(url, data=data, timeout=5)
except (requests.ConnectionError, requests.exceptions.ReadTimeout):
pass # do not push analytics if no network


async def _do_wasm_analytics_request(url: str, data: dict[str, Any]) -> None:
data["ip_address"] = "No internet connection"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the current get_local_ip_address() doesn't work in the Wasm env, and even if we make its Wasm-compatible version, gathering users' public IP-address looks too much, I hardcoded the string literal representing the failure of IP address detection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gathering users' public IP-address looks too much

What do you mean by looks too much. Do you mean that its too intrusive? Or that its not techincally feasible?

The reason that I ask is that main benefit of collecting analytics is that it allows us to measure daily, weekly, and monthly unique users which we might not be able to do if we don't collect IP addresses.

Copy link
Member Author

@whitphx whitphx Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt it's too intrusive.
In the case of Gradio-lite, this telemetry will be sent from clients' device immediately after they open the app page, where they don't have any chance to set analytics_enabled=false to opt-out it. I think some amount of people consider it as a bad behavior to automatically access the third-party API (https://checkip.amazonaws.com/) to get the IP address and send it as analytics data, under the recent situation where people are sensitive to data privacy.
However, getting IP addresses itself might be allowed as long as they are not logged or properly stored securely (I'm not sure), so I would like some advice from experts.

Copy link
Member Author

@whitphx whitphx Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit including the IP address fetcher (b77247c), though we should revert it if it looks not good anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice @whitphx. Let's keep it for now -- as part of #4226, we might replace the IP address with a hash of the IP address so that we can still keep a track of unique users without sending IP addresses.

@whitphx whitphx requested a review from abidlabs August 2, 2023 13:12
@abidlabs
Copy link
Member

abidlabs commented Aug 4, 2023

LGTM @whitphx! This looks good to go. Apologies this took a while to review

@abidlabs abidlabs merged commit 3b9494f into gradio-app:main Aug 4, 2023
12 checks passed
@whitphx whitphx deleted the wasm-compatible-analytics branch August 4, 2023 14:01
@whitphx
Copy link
Member Author

whitphx commented Aug 4, 2023

@abidlabs Thank you!

@pngwn pngwn mentioned this pull request Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants