feat: port 4 valuable PRs from YOUTUBE-EXTENSION (pre-archive salvage) #113

groupthinking merged 5 commits into `main`
Conversation
Ported from YOUTUBE-EXTENSION PRs #703, #706, #707, #708, which were opened by Copilot/Claude agents right before the repo was archived.

## PR #706 — Fix duplicate skeleton repo generation

- Added `_extract_video_id()` and `_build_title()` helpers to code_generator.py
- deployment_manager.py: replaced `timestamp % 10000` with a UUID4 suffix
- Root-cause fix for the 11 identical `uvai-generated-project-*` repos

## PR #703 — Skill Builder (deployment learning system)

- New: src/youtube_extension/services/skill_builder.py (368 lines)
- Records deployment outcomes and derives lessons via EMA weighting
- Surfaces top-N lessons for injection into future LLM prompts

## PR #708 — Cloud-native services (Vertex AI + Firestore + Cloud Tasks)

- New: src/youtube_extension/services/cloud/ (5 service modules)
- Firestore shared state replacing in-memory dicts
- Cloud Tasks async video-processing queue with retry/backoff
- Vertex AI Agent Builder integration with Google Embedded 2
- New: cloud_api_endpoints.py (`/api/v3/*` routes)
- New: Dockerfile.cloudrun, infrastructure/cloudrun/setup.sh
- New: docs/cloud-native-architecture.md

## PR #707 — Gemini Vision multimodal ingestion

- Updated gemini_service.py with frame extraction + batch analysis
- Updated enhanced_video_processor.py with a visual-context pipeline
- Updated videopack/schema.py with VisualContext + VisualElement models
- New: docs/GEMINI_VISION_INTEGRATION.md

Closes YOUTUBE-EXTENSION PRs: #702, #703, #704, #705, #706, #707, #708, #709, #710, #711

Related: EventRelay #111 (secrets still needed)
**Summary of Changes** (Gemini Code Assist): This pull request ports four valuable PRs from the archived YOUTUBE-EXTENSION repository, focusing on enhancing the project's architecture and capabilities. It introduces cloud-native services, multimodal ingestion with Gemini Vision, and a fix for duplicate repository generation, significantly improving the project's scalability, intelligence, and robustness.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Code Review
This pull request is a significant contribution, porting several valuable features from an archived repository, including a new Skill Builder, cloud-native services for scalable deployment, and multimodal ingestion with Gemini Vision. The code is generally well-structured and includes new documentation and tests.
My review focuses on improving robustness, security, and maintainability. Key suggestions include:
- Fixing a critical bug in the Cloud Tasks integration where a URL mismatch would prevent tasks from being processed.
- Enhancing the security of the Cloud Tasks handler by using OIDC tokens instead of a simple header check.
- Improving the task handler's error handling to prevent unnecessary retries for permanent failures.
- Addressing a race condition in the Vertex AI service by avoiding state mutation in a singleton.
- Other suggestions focus on refactoring duplicated code, improving health checks, and making parsing logic more robust.
```python
# Verify request is from Cloud Tasks
if not x_cloudtasks_taskname:
    logger.warning("Unauthorized task handler access attempt")
    raise HTTPException(
        status_code=403,
        detail="Only Cloud Tasks can call this endpoint"
    )
```
The current security check for the Cloud Tasks handler relies on the x-cloudtasks-taskname header, which can be spoofed if the endpoint is publicly accessible. Since the Cloud Run service is deployed with --allow-unauthenticated, it's critical to use a more robust authentication mechanism. The recommended approach is to use OIDC tokens. You should configure Cloud Tasks to create tasks with an OIDC token and then validate this token in your FastAPI endpoint to ensure that requests are legitimate and originate from the correct service account.
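As a rough illustration of the suggested direction, the sketch below separates the part that can be shown self-contained (parsing the `Authorization` header) from the actual OIDC verification, which requires the `google-auth` library and is only indicated in comments. The `EXPECTED_SA` and `SERVICE_URL` names are assumptions, not values from this codebase.

```python
from typing import Optional

def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Pull the raw JWT out of an 'Authorization: Bearer <token>' header."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    return token

# Inside the FastAPI endpoint one might then verify the token (needs google-auth):
#   from google.oauth2 import id_token
#   from google.auth.transport import requests as grequests
#   claims = id_token.verify_oauth2_token(token, grequests.Request(), audience=SERVICE_URL)
#   if claims.get("email") != EXPECTED_SA:  # hypothetical service-account check
#       raise HTTPException(status_code=403, detail="Invalid caller")
```

Cloud Tasks must also be configured to attach an OIDC token to each task (via the task's HTTP request settings) for this check to have anything to verify.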
```python
async def generate_structured_output(
    self,
    prompt: str,
    schema: Dict[str, Any],
) -> Dict[str, Any]:
    """
    Generate structured JSON output from prompt.

    Args:
        prompt: User prompt
        schema: JSON schema for output

    Returns:
        Structured data matching schema
    """
    # Update model config with schema
    original_config = self.agent_config
    self.agent_config.response_schema = schema
    self._initialize_model()

    try:
        response = await self.process_text(prompt)
        # Parse JSON response
        result = json.loads(response.text)
        return result
    finally:
        # Restore original config
        self.agent_config = original_config
        self._initialize_model()
```
The generate_structured_output method modifies the shared self.agent_config and re-initializes self.model. Since this service is used as a singleton, this is not thread-safe and can lead to race conditions in a concurrent environment. A better approach is to pass a custom generation_config to the generate_content call, which it supports. You should modify process_text to accept an optional generation_config override and pass it to the model. This avoids mutating the shared service state.
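A minimal sketch of the mutation-free pattern, under the assumption that a per-call config can be threaded down to the model call. The class and field names mirror the reviewed code, but the surrounding service is stubbed for illustration.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional

@dataclass(frozen=True)
class AgentConfig:
    response_schema: Optional[Dict[str, Any]] = None

class VertexAIAgentService:
    def __init__(self, agent_config: Optional[AgentConfig] = None):
        self.agent_config = agent_config or AgentConfig()

    def _effective_config(self, override: Optional[AgentConfig]) -> AgentConfig:
        # Per-call override wins; shared self.agent_config is never modified.
        return override if override is not None else self.agent_config

    def config_for_schema(self, schema: Dict[str, Any]) -> AgentConfig:
        # Derive a one-off config; replace() copies instead of mutating.
        return replace(self.agent_config, response_schema=schema)
```

`generate_structured_output` could then build a config with `config_for_schema(schema)` and pass it to `process_text`, leaving the singleton's state untouched between concurrent requests.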
```dockerfile
# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD python -c "import requests; requests.get('http://localhost:8000/health', timeout=5)"
```
The current healthcheck command doesn't verify the HTTP status code of the response. requests.get() will succeed even if the endpoint returns an error status like 500. It's better to use raise_for_status() to ensure the healthcheck fails on non-2xx responses.
```dockerfile
CMD python -c "import requests; requests.get('http://localhost:8000/health', timeout=5).raise_for_status()"
```
```python
except Exception as e:
    error_msg = f"Task processing failed: {str(e)}"
    logger.error(error_msg)

    # Update state with error
    try:
        firestore_service = await get_firestore_service()
        await firestore_service.update_state(
            payload.video_id,
            status='failed',
            error_message=error_msg
        )
    except Exception as state_error:
        logger.error(f"Failed to update error state: {state_error}")

    raise HTTPException(status_code=500, detail=error_msg)
```
The task handler catches all exceptions and returns a 500 HTTP status code. This causes Cloud Tasks to retry the task, even for permanent failures (e.g., an invalid video URL). This can lead to unnecessary retries and wasted resources. You should differentiate between transient and permanent errors. For permanent errors, log the failure, update the state in Firestore, and return a 2xx status code to Cloud Tasks to acknowledge the task and prevent retries.
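One way to implement this distinction is a small classifier that maps exception types to the status returned to Cloud Tasks. The specific exception types treated as permanent here are illustrative assumptions, not an inventory of this codebase's errors.

```python
class PermanentTaskError(Exception):
    """Raised for failures that will never succeed on retry (e.g. an invalid video URL)."""

def status_for_failure(exc: Exception) -> int:
    """Map an exception to the HTTP status returned to Cloud Tasks."""
    if isinstance(exc, (PermanentTaskError, ValueError)):
        return 200  # acknowledge the task so Cloud Tasks does not retry
    return 500      # transient failure: let Cloud Tasks retry with backoff
```

The handler would still record the failure in Firestore either way; only the status code changes, so transient network errors retry while malformed inputs are dropped after one attempt.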
```python
def _extract_video_id(self, video_url: str) -> str:
    """Extract video ID from YouTube URL"""
    # Simple extraction - can be enhanced
    if 'youtube.com/watch?v=' in video_url:
        return video_url.split('v=')[1].split('&')[0]
    elif 'youtu.be/' in video_url:
        return video_url.split('youtu.be/')[1].split('?')[0]
    else:
        # Assume it's already an ID
        return video_url
```
This _extract_video_id implementation is duplicated and less robust than the one introduced in src/youtube_extension/backend/code_generator.py. To avoid code duplication and ensure consistency, you should move the more robust version from code_generator.py into a shared utility module and use it in both places. The other implementation correctly uses urllib.parse to handle various URL formats more reliably.
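A shared utility along the lines the review suggests might look like the sketch below. The exact behaviour of the version in code_generator.py is not shown in this PR excerpt, so this is an approximation of a `urllib.parse`-based extractor, not a copy of it.

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(video_url: str) -> str:
    """Extract a YouTube video ID from watch URLs, short URLs, or bare IDs."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()
    if host.endswith("youtube.com"):
        # watch URLs carry the ID in the v= query parameter
        ids = parse_qs(parsed.query).get("v", [])
        if ids:
            return ids[0]
    if host == "youtu.be":
        # short URLs carry the ID as the first path segment
        return parsed.path.lstrip("/").split("/")[0]
    # Fall back to treating the input as a bare video ID
    return video_url
```

Unlike naive string splitting, `urlparse` handles extra query parameters, fragments, and mobile or `www.` host variants uniformly.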
Pull request overview
Ports previously authored (pre-archive) functionality into the current codebase, adding cloud-native processing components and Gemini Vision “multimodal ingestion”, while also improving project/repo naming uniqueness during generation/deployment.
Changes:
- Add Gemini Vision visual-context extraction (schema + processor integration) and related documentation/tests.
- Introduce cloud-native processing modules (Firestore state, Cloud Tasks queue, Vertex AI agent service) plus Cloud Run setup artifacts.
- Fix duplicate repo generation by switching to UUID-based suffixes and improve generated project titles using YouTube video IDs.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 15 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_gemini_vision_integration.py | Adds schema + integration tests for visual-context ingestion. |
| tests/test_firestore_state.py | Adds unit tests for the Firestore state service. |
| src/youtube_extension/videopack/schema.py | Extends VideoPack schema with visual context models/fields. |
| src/youtube_extension/services/skill_builder.py | Adds a deployment “learning system” for persisting lessons/weights. |
| src/youtube_extension/services/cloud/vertex_ai_agent.py | Adds a Vertex AI “agent” wrapper (text/video/transcript + embeddings). |
| src/youtube_extension/services/cloud/firestore_state.py | Adds Firestore-backed state management with optional local caching. |
| src/youtube_extension/services/cloud/cloud_video_processor.py | Adds a cloud-oriented orchestrator for queued/sync processing (currently with placeholders). |
| src/youtube_extension/services/cloud/cloud_tasks_queue.py | Adds Cloud Tasks enqueueing and queue operations. |
| src/youtube_extension/services/cloud/__init__.py | Exposes cloud service APIs from the cloud package. |
| src/youtube_extension/services/cloud/README.md | Documents cloud services usage and setup. |
| src/youtube_extension/services/ai/gemini_service.py | Updates Gemini service implementation and adds frame extraction + frame analysis helpers. |
| src/youtube_extension/backend/enhanced_video_processor.py | Integrates Gemini Vision visual-context extraction into the enhanced pipeline and markdown output. |
| src/youtube_extension/backend/deployment_manager.py | Fixes repo-name collisions by using UUID suffixes. |
| src/youtube_extension/backend/code_generator.py | Improves project naming by deriving titles from video IDs when metadata/title missing. |
| src/youtube_extension/backend/cloud_api_endpoints.py | Adds (opt-in) FastAPI endpoints for cloud-native processing and Cloud Tasks callbacks. |
| infrastructure/cloudrun/setup.sh | Adds GCP infra bootstrap script (APIs, SA, Firestore, Tasks, secrets). |
| examples/cloud_services_example.py | Adds example usage for the new cloud services modules. |
| docs/cloud-native-architecture.md | Adds end-to-end cloud-native architecture documentation. |
| docs/GEMINI_VISION_INTEGRATION.md | Adds Gemini Vision integration documentation and usage details. |
| Dockerfile.cloudrun | Adds a Cloud Run-oriented Dockerfile for deploying the backend with cloud extras. |
```python
task = tasks_v2.Task(
    http_request=tasks_v2.HttpRequest(
        http_method=tasks_v2.HttpMethod.POST,
        url=f"{self.service_url}/api/v2/process-video-task",
        headers={
            "Content-Type": "application/json",
        },
        body=video_task.to_json().encode(),
    )
)
```
The Cloud Tasks HTTP target is configured to call /api/v2/process-video-task, but the newly added task handler endpoint is /api/v3/process-video-task. This mismatch will cause all queued tasks to hit a 404. Update the task URL to the actual handler path (or align the handler route to what the queue sends).
```python
self.project_id = project_id or os.getenv('GOOGLE_CLOUD_PROJECT')
self.location = location
self.agent_config = agent_config or AgentConfig()

# Initialize Vertex AI
vertexai.init(project=self.project_id, location=self.location)
```
If GOOGLE_CLOUD_PROJECT is not set and no project_id is passed, self.project_id can be None, and vertexai.init(project=None, ...) will fail with a less clear error later. Validate project_id during initialization and raise a clear ValueError (or default from env) so misconfiguration is caught early.
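A minimal fail-fast helper, assuming the same env-var fallback as the constructor above; the function name is an illustration, not from the codebase.

```python
import os
from typing import Optional

def resolve_project_id(project_id: Optional[str] = None) -> str:
    """Return an explicit or env-provided GCP project ID, failing fast if neither exists."""
    resolved = project_id or os.getenv("GOOGLE_CLOUD_PROJECT")
    if not resolved:
        raise ValueError(
            "No GCP project configured: pass project_id or set GOOGLE_CLOUD_PROJECT"
        )
    return resolved
```

Calling this before `vertexai.init(...)` turns an obscure downstream failure into an immediate, self-explanatory error at construction time.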
```python
async def _fetch_metadata(self, video_url: str) -> Dict[str, Any]:
    """
    Fetch video metadata.

    This should integrate with real YouTube Data API.
    """
    # Placeholder - integrate with real implementation
    logger.info(f"Fetching metadata for: {video_url}")

    return {
        'title': 'Video Title',
        'channel': 'Channel Name',
        'duration': '10:30',
        'views': 1000,
        'description': 'Video description',
    }
```
process_video_sync() currently uses placeholder implementations for metadata and transcript extraction (_fetch_metadata() / _extract_transcript() return hard-coded dummy values). If these endpoints are intended to be usable, this will produce incorrect results and pollute Firestore state. Either integrate the real YouTube metadata/transcript services here or raise NotImplementedError (and disable the routes) until the real implementation is wired in.
```bash
pip install -e .[cloud]
```
The Cloud services README instructs pip install -e .[cloud], but there is no cloud optional dependency group defined in pyproject.toml. Update the installation instructions to match the actual extras (or add a cloud extra) so users can follow this doc successfully.
```diff
 class VideoPackV0(BaseModel):
     version: VPVersion = VPVersion.v0
     id: str = Field(default_factory=lambda: str(_uuid.uuid4()))
     video_id: constr(strip_whitespace=True, min_length=3)
-    source_url: HttpUrl | None = None
+    source_url: Optional[HttpUrl] = None

     transcript: Transcript
-    chapters: list[Chapter] = Field(default_factory=list)
-    keyframes: list[Keyframe] = Field(default_factory=list)
-    concepts: list[str] = Field(default_factory=list)
-    requirements: list[Requirement] = Field(default_factory=list)
-    code_snippets: list[CodeSnippet] = Field(default_factory=list)
-    code_cues: list[CodeCue] = Field(default_factory=list)
-    tasks: list[Task] = Field(default_factory=list)
-    artifacts: list[ArtifactRef] = Field(default_factory=list)
+    keyframes: List[Keyframe] = Field(default_factory=list)
+    concepts: List[str] = Field(default_factory=list)
+    requirements: List[Requirement] = Field(default_factory=list)
+    code_snippets: List[CodeSnippet] = Field(default_factory=list)
+    artifacts: List[ArtifactRef] = Field(default_factory=list)

+    # Stage 1: Multimodal Ingestion - Visual context from Gemini Vision
+    visual_context: Optional[VisualContext] = Field(None, description="Visual analysis from video frames")
```
This change removes the Chapter, CodeCue, and Task models and the corresponding VideoPackV0 fields, but src/youtube_extension/videopack/__init__.py still imports/exports Chapter, CodeCue, and Task. As-is, importing youtube_extension.videopack will raise an ImportError. Either keep these models in schema.py (even if deprecated) or update the package exports and any serialization/validation that expects these fields.
```python
cutoff_date = datetime.now(timezone.utc).timestamp() - (days * 24 * 60 * 60)
collection = self._get_collection()

# Query old states
query = collection.where('created_at', '<', cutoff_date)
docs = await query.get()
```
cleanup_old_states() compares created_at against a numeric UNIX timestamp (cutoff_date), but created_at is stored as an ISO-8601 string (see VideoProcessingState.to_dict() / create_state). This query will never match correctly (and may error depending on Firestore type coercion). Store created_at as a Firestore Timestamp/datetime (preferred) or compare against an ISO string cutoff computed with the same format.
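If the ISO-string route is taken, the key point is that ISO-8601 timestamps in a fixed UTC format sort lexicographically, so a string cutoff compares correctly against string fields. A small sketch (the helper name is illustrative):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def iso_cutoff(days: int, now: Optional[datetime] = None) -> str:
    """ISO-8601 UTC timestamp for `days` ago, comparable as a string."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days)).isoformat()

# e.g. query = collection.where('created_at', '<', iso_cutoff(30))
```

Storing `created_at` as a native datetime/Timestamp remains the cleaner option, since Firestore then handles ordering and range queries without any string-format coupling.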
```python
response = self.client.create_task(
    request=tasks_v2.CreateTaskRequest(
        parent=queue_path,
        task=task,
    )
)
```
These Cloud Tasks operations are implemented as async def but call the synchronous tasks_v2.CloudTasksClient methods directly (e.g., create_task). This will block the event loop under load. Wrap the blocking client calls in asyncio.to_thread(...) (or switch to an async client) so FastAPI can continue serving concurrent requests.
Suggested change:

```diff
-response = self.client.create_task(
-    request=tasks_v2.CreateTaskRequest(
-        parent=queue_path,
-        task=task,
-    )
+response = await asyncio.to_thread(
+    self.client.create_task,
+    tasks_v2.CreateTaskRequest(
+        parent=queue_path,
+        task=task,
+    ),
```
```diff
@@ -317,24 +280,6 @@ class GeminiResult:
     error: Optional[str] = None


-class _GenaiClientModelProxy:
-    """Thin wrapper around google.genai.Client that exposes generate_content()
-    so existing call sites (which expect the old GenerativeModel interface)
-    work with the new Client-based SDK."""
-
-    def __init__(self, client: Any, model_name: str):
-        self._client = client
-        self._model_name = model_name
-
-    def generate_content(self, contents: Any, *, generation_config: Any = None, **kwargs: Any) -> Any:
-        return self._client.models.generate_content(
-            model=self._model_name,
-            contents=contents,
-            config=generation_config,
-            **kwargs,
-        )
-
-
 class GeminiService:
     """
     Service for cloud-based vision-language processing using Google Gemini.
@@ -353,48 +298,46 @@ def __init__(self, config: Optional[GeminiConfig] = None):
         self._model = None
         self._use_vertex = False
         self._is_initialized = False
-        self._model_cache: dict[str, Any] = {}
-        self._backend_cache: dict[str, str] = {}
-        self._vertex_cache: dict[str, bool] = {}
+        self._model_cache: Dict[str, Any] = {}
+        self._backend_cache: Dict[str, str] = {}
+        self._vertex_cache: Dict[str, bool] = {}
         self._backend_kind: str = "gemini"

         # Initialize client on startup if credentials available
         if self.is_available():
-            self._verification_failed = False
-            self._initialize_client()
+            self._initialize_client()
```
GeminiConfig no longer populates api_key / project_id from environment variables, and GeminiService.is_available() only checks self.config.api_key / self.config.project_id. This breaks existing call sites that construct GeminiConfig(...) without explicitly passing api_key (expecting GEMINI_API_KEY) and also breaks callers passing a plain dict config (e.g., GeminiService(self.config)) because GeminiService assumes a GeminiConfig instance. Consider: (1) setting api_key and project_id defaults from os.getenv(...) in GeminiConfig, and (2) allowing GeminiService.__init__ to accept dict by coercing it into GeminiConfig(**config).
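Both suggestions can be sketched together as below. Field names follow the review comment, but the full `GeminiConfig` surface and the real `is_available()` logic are assumptions for illustration.

```python
import os
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Union

@dataclass
class GeminiConfig:
    # Defaults pulled from the environment so GeminiConfig() keeps working
    api_key: Optional[str] = field(default_factory=lambda: os.getenv("GEMINI_API_KEY"))
    project_id: Optional[str] = field(default_factory=lambda: os.getenv("GOOGLE_CLOUD_PROJECT"))

class GeminiService:
    def __init__(self, config: Union[GeminiConfig, Dict[str, Any], None] = None):
        if isinstance(config, dict):
            config = GeminiConfig(**config)  # coerce plain-dict callers
        self.config = config or GeminiConfig()

    def is_available(self) -> bool:
        return bool(self.config.api_key or self.config.project_id)
```

This keeps both existing call patterns working: `GeminiService(GeminiConfig(...))` and `GeminiService({"api_key": ...})`, with env vars filling any fields the caller omits.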
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@groupthinking I've opened a new pull request, #114, to work on those changes. Once the pull request is ready, I'll request review from you.

@groupthinking I've opened a new pull request, #115, to work on those changes. Once the pull request is ready, I'll request review from you.

@groupthinking I've opened a new pull request, #116, to work on those changes. Once the pull request is ready, I'll request review from you.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

@groupthinking I've opened a new pull request, #117, to work on those changes. Once the pull request is ready, I'll request review from you.

@groupthinking I've opened a new pull request, #118, to work on those changes. Once the pull request is ready, I'll request review from you.

@groupthinking I've opened a new pull request, #119, to work on those changes. Once the pull request is ready, I'll request review from you.
## What this does

Ports the 4 PRs from YOUTUBE-EXTENSION that had real, valuable code before the repo was archived. All 26 open PRs in YOUTUBE-EXTENSION have been closed.

## Changes

- Replaced the `time() % 10000` repo-name suffix with a UUID4 suffix

## Stats

## Remaining