[wip] attempt to anonymize transcripts #433
base: main
Changes from all commits
d055a02
e168276
fd32c3e
d6344df
26505ca
8eb83ef
d641687
    @@ -0,0 +1,11 @@
    +# Example Environment Variables for Lightspeed Stack
    +# Copy this file to .env and set appropriate values
    +
    +# Required: User anonymization pepper (set to a secure random value)
    +# This is used for HMAC-based user ID hashing to protect user privacy
    +USER_ANON_PEPPER=your-secure-random-string-here
    +
    +# Optional: OpenAI API Key (if using OpenAI models)
    +OPENAI_API_KEY=your-openai-api-key
    +
    +# Optional: Other environment variables as needed for your configuration
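The example file mentions HMAC-based user ID hashing keyed by `USER_ANON_PEPPER`. The PR's actual `get_anonymous_user_id` implementation is not shown here, so the following is only a minimal sketch of what such hashing might look like. The function name `hash_user_id` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import hmac


def hash_user_id(user_id: str, pepper: str) -> str:
    """Return a hex HMAC-SHA256 digest of user_id, keyed with the pepper.

    Hypothetical helper: the name and the SHA-256 choice are assumptions,
    not taken from the PR.
    """
    return hmac.new(
        pepper.encode("utf-8"),
        user_id.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


if __name__ == "__main__":
    # Demo value only; a real deployment would read a secret pepper
    # from the environment instead of hard-coding it.
    demo_pepper = "demo-only-pepper"
    print(hash_user_id("alice@example.com", demo_pepper))
```

A SHA-256 hex digest is 64 characters long. The same user ID hashed with a different pepper yields an unrelated digest, which is why sharing one pepper across environments weakens isolation.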
    @@ -21,6 +21,7 @@
         ForbiddenResponse,
     )
     from utils.suid import get_suid
    +from utils.user_anonymization import get_anonymous_user_id
     
     logger = logging.getLogger(__name__)
     router = APIRouter(prefix="/feedback", tags=["feedback"])
    @@ -134,7 +135,8 @@ def store_feedback(user_id: str, feedback: dict) -> None:
             user_id (str): Unique identifier of the user submitting feedback.
             feedback (dict): Feedback data to be stored, merged with user ID and timestamp.
         """
    -    logger.debug("Storing feedback for user %s", user_id)
    +    anonymous_user_id = get_anonymous_user_id(user_id)
    +    logger.debug("Storing feedback for anonymous user %s", anonymous_user_id)
         # Creates storage path only if it doesn't exist. The `exist_ok=True` prevents
         # race conditions in case of multiple server instances trying to set up storage
         # at the same location.
    @@ -144,7 +146,11 @@ def store_feedback(user_id: str, feedback: dict) -> None:
         storage_path.mkdir(parents=True, exist_ok=True)
     
         current_time = str(datetime.now(UTC))
    -    data_to_store = {"user_id": user_id, "timestamp": current_time, **feedback}
    +    data_to_store = {
    +        "anonymous_user_id": anonymous_user_id,
    +        "timestamp": current_time,
    +        **feedback,
    +    }
Comment on lines 148 to +153 (Contributor):
Refactor suggestion: Prevent client payload from overriding server-controlled fields, and use ISO 8601 timestamps.
With the current dict merge order, keys in `feedback` can override the server-set `anonymous_user_id` and `timestamp`. Apply this diff:

    -    current_time = str(datetime.now(UTC))
    -    data_to_store = {
    -        "anonymous_user_id": anonymous_user_id,
    -        "timestamp": current_time,
    -        **feedback,
    -    }
    +    current_time = datetime.now(UTC).isoformat()
    +    # Ensure server-controlled fields cannot be overridden by client payload
    +    data_to_store = {
    +        **feedback,
    +        "anonymous_user_id": anonymous_user_id,
    +        "timestamp": current_time,
    +    }
     
         # stores feedback in a file under unique uuid
         feedback_file_path = storage_path / f"{get_suid()}.json"
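The merge-order concern raised in the review comment on this hunk can be reproduced in isolation. The sketch below is illustrative only: the two builder function names are hypothetical, and `timezone.utc` is used in place of `datetime.UTC` so the snippet also runs on Python versions older than 3.11.

```python
from datetime import datetime, timezone


def build_record_client_last(anonymous_user_id: str, feedback: dict) -> dict:
    """Mirror the ordering in the PR: the client payload is unpacked last."""
    return {
        "anonymous_user_id": anonymous_user_id,
        "timestamp": str(datetime.now(timezone.utc)),
        **feedback,  # later keys win, so the payload can overwrite the fields above
    }


def build_record_server_last(anonymous_user_id: str, feedback: dict) -> dict:
    """Mirror the reviewer's ordering: server-controlled fields come last."""
    return {
        **feedback,
        "anonymous_user_id": anonymous_user_id,  # overrides any payload value
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601
    }


if __name__ == "__main__":
    payload = {"anonymous_user_id": "spoofed", "sentiment": 1}
    print(build_record_client_last("anon-123", payload)["anonymous_user_id"])
    print(build_record_server_last("anon-123", payload)["anonymous_user_id"])
```

In a dict display, a later key replaces an earlier one with the same name, so placing `**feedback` first lets the server-set values take precedence.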
    @@ -0,0 +1,32 @@
    +"""User ID anonymization mapping model."""
    +
    +from datetime import datetime
    +
    +from sqlalchemy.orm import Mapped, mapped_column
    +from sqlalchemy import DateTime, func, Index, String
    +
    +from models.database.base import Base
    +
    +
    +class UserMapping(Base):  # pylint: disable=too-few-public-methods
    +    """Model for mapping real user IDs to anonymous UUIDs."""
    +
    +    __tablename__ = "user_mapping"
    +
    +    # Anonymous UUID used for all storage/analytics (primary key)
    +    anonymous_id: Mapped[str] = mapped_column(
    +        String(36), primary_key=True, nullable=False
    +    )
    +
    +    # Original user ID from authentication (hashed for security)
    +    user_id_hash: Mapped[str] = mapped_column(
    +        String(64), index=True, unique=True, nullable=False
    +    )
    +
    +    created_at: Mapped[datetime] = mapped_column(
    +        DateTime(timezone=True),
    +        server_default=func.now(),  # pylint: disable=not-callable
    +    )
    +
    +    # Index for efficient lookups
    +    __table_args__ = (Index("ix_user_mapping_hash_lookup", "user_id_hash"),)
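The model above stores one anonymous UUID per hashed user ID, but the lookup logic that uses it is not part of this excerpt. As a rough sketch of a get-or-create flow against a table of the same shape, the following uses the standard-library `sqlite3` module rather than SQLAlchemy so that it runs on its own. The function names `make_demo_connection` and `get_or_create_anonymous_id` are hypothetical.

```python
import sqlite3
import uuid


def make_demo_connection() -> sqlite3.Connection:
    """Create an in-memory table shaped like the user_mapping model (assumed DDL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_mapping ("
        " anonymous_id TEXT PRIMARY KEY,"
        " user_id_hash TEXT NOT NULL UNIQUE,"
        " created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def get_or_create_anonymous_id(conn: sqlite3.Connection, user_id_hash: str) -> str:
    """Return the stored anonymous ID for a hash, creating one if absent."""
    row = conn.execute(
        "SELECT anonymous_id FROM user_mapping WHERE user_id_hash = ?",
        (user_id_hash,),
    ).fetchone()
    if row is not None:
        return row[0]
    anonymous_id = str(uuid.uuid4())  # 36-character canonical UUID string
    conn.execute(
        "INSERT INTO user_mapping (anonymous_id, user_id_hash) VALUES (?, ?)",
        (anonymous_id, user_id_hash),
    )
    conn.commit()
    return anonymous_id
```

The `UNIQUE` constraint on `user_id_hash` plays the same role as `unique=True` in the model: a second insert for the same hash fails instead of silently creating a second anonymous identity. A concurrent deployment would also need to handle that failure, which this sketch does not attempt.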
Refactor suggestion: Remove insecure default pepper fallback.
Providing a default pepper invites accidental production deployments with a shared key, breaking privacy guarantees and cross-environment isolation.
Use a required variable and fail fast if it's missing (compose will error).
Optional (outside this line): keep a dev-only override in docker-compose.override.yaml or an `.env` file that is not committed.
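The reviewer's suggestion targets the Compose file, and that change is not shown in this excerpt. The same fail-fast idea can also be applied on the application side. The sketch below assumes a hypothetical `load_pepper` helper and is not taken from the PR.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_pepper(env: Optional[Mapping[str, str]] = None) -> str:
    """Read USER_ANON_PEPPER and refuse to continue if it is missing or blank.

    Hypothetical helper: there is deliberately no default value, so a
    misconfigured deployment fails at startup instead of using a shared key.
    """
    source = os.environ if env is None else env
    pepper = source.get("USER_ANON_PEPPER", "").strip()
    if not pepper:
        raise RuntimeError(
            "USER_ANON_PEPPER must be set to a secure random value"
        )
    return pepper
```

Passing a mapping explicitly, for example `load_pepper({"USER_ANON_PEPPER": "..."})`, keeps the check easy to exercise in tests without touching the real process environment.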