Added language detection and KV-based usage analytics #51
Pull Request Overview
This PR introduces a comprehensive analytics system for the code translation feature, tracking usage patterns through Cloudflare Workers KV storage. It implements automatic source language detection using Google Gemini AI and provides an analytics endpoint for monitoring translation usage.
- Added KV-based analytics tracking that records source-target language pairs with usage counts
- Implemented automatic source language detection for submitted code using AI
- Created a new /v1/analytics endpoint to retrieve usage statistics
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| backend/wrangler.jsonc | Added KV namespace binding for analytics storage |
| backend/src/index.ts | Implemented analytics tracking, language detection, and analytics endpoint |
backend/src/index.ts
Outdated
async function handleTranslate(request: Request, model: ReturnType<GoogleGenerativeAI['getGenerativeModel']>) {

async function updateAnalytics(source: string, dest: string, env: Env) {
  const key = `${source}-${dest}`;
Copilot AI, Oct 13, 2025
The analytics key should be normalized (lowercase and trimmed) to avoid duplicates, similar to how the rate limit key is handled on line 19.
Suggested change:
- const key = `${source}-${dest}`;
+ const normalizedSource = source.trim().toLowerCase();
+ const normalizedDest = dest.trim().toLowerCase();
+ const key = `${normalizedSource}-${normalizedDest}`;
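For reference, a minimal sketch of how a normalized key could feed a counter update. The `KVLike` interface, the `{ count }` value shape, and the `memoryKV` stand-in are assumptions made for illustration; the PR's actual `updateAnalytics` body and stored value layout are not shown in this thread.

```typescript
// Hypothetical minimal subset of a KV namespace: string get/put only.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// One canonical key per language pair, e.g. " Python " + "GO" -> "python-go".
function analyticsKey(source: string, dest: string): string {
  return `${source.trim().toLowerCase()}-${dest.trim().toLowerCase()}`;
}

// Assumed stored shape: { count: number }.
async function incrementPair(kv: KVLike, source: string, dest: string): Promise<number> {
  const key = analyticsKey(source, dest);
  const raw = await kv.get(key);
  let count = 0;
  try {
    const parsed = JSON.parse(raw ?? "{}");
    if (typeof parsed.count === "number") count = parsed.count;
  } catch {
    // Corrupt or unexpected value: restart the counter instead of failing the request.
  }
  count += 1;
  await kv.put(key, JSON.stringify({ count }));
  return count;
}

// In-memory stand-in for a KV namespace, for local experimentation only.
function memoryKV(): KVLike {
  const store = new Map<string, string>();
  return {
    get: async (k) => store.get(k) ?? null,
    put: async (k, v) => {
      store.set(k, v);
    },
  };
}
```

Note that a read-modify-write like this is not atomic on Workers KV, so concurrent requests for the same pair may undercount. That is usually acceptable for rough usage statistics.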
backend/src/index.ts
Outdated
${code}`;

const result = await model.generateContent(prompt);
return result.response.text().trim();
Copilot AI, Oct 13, 2025
The detectLanguage function should normalize the returned language name to lowercase to ensure consistent analytics keys, preventing duplicate entries like 'Python' vs 'python'.
Suggested change:
- return result.response.text().trim();
+ return result.response.text().trim().toLowerCase();
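Model replies are free text, so lowercasing alone may still leave variants such as a trailing period or surrounding backticks. The helper below is a hypothetical sketch of slightly stricter cleanup; `normalizeLanguageLabel` and the "unknown" fallback are illustrative names and choices, not part of the PR.

```typescript
// Reduce a free-text model reply to a single lowercase label.
function normalizeLanguageLabel(raw: string): string {
  // Keep only the first line in case the model adds an explanation below it.
  const firstLine = raw.trim().split(/\r?\n/)[0] ?? "";
  // Strip surrounding quotes, backticks, and asterisks, plus trailing periods.
  const cleaned = firstLine
    .replace(/^[\s"'`*]+|[\s"'`*.]+$/g, "")
    .toLowerCase();
  // Assumed fallback so an empty reply never produces an empty analytics key.
  return cleaned.length > 0 ? cleaned : "unknown";
}
```

Characters that are part of language names, such as the `+` in C++ or the `#` in C#, are left untouched.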
backend/src/index.ts
Outdated
const stats: Record<string, any> = {};
for (const key of list.keys) {
  const val = await env.LANG_TRANSLATION_ANALYTICS.get(key.name);
  stats[key.name] = JSON.parse(val || '{}');
Copilot AI, Oct 13, 2025
JSON.parse should be wrapped in a try-catch block to handle potential parsing errors, similar to the error handling in updateAnalytics function.
Suggested change:
- stats[key.name] = JSON.parse(val || '{}');
+ try {
+   stats[key.name] = JSON.parse(val || '{}');
+ } catch (e) {
+   console.error(`Failed to parse analytics value for key "${key.name}":`, e);
+   stats[key.name] = {};
+ }
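One way to keep the loop readable is to move the guarded parse into a helper and fetch the values concurrently. This is a sketch under assumptions: `KVReadLike`, `safeParse`, and `collectStats` are hypothetical names, and the interface only mirrors the `list()` and `get()` calls visible in the snippet above.

```typescript
// Hypothetical minimal read-only view of a KV namespace.
interface KVReadLike {
  list(): Promise<{ keys: { name: string }[] }>;
  get(key: string): Promise<string | null>;
}

// Parse a stored value, falling back to {} for missing, malformed, or non-object data.
function safeParse(raw: string | null): Record<string, unknown> {
  if (!raw) return {};
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : {};
  } catch {
    return {};
  }
}

// Read every key's value concurrently instead of awaiting each get in turn.
async function collectStats(kv: KVReadLike): Promise<Record<string, Record<string, unknown>>> {
  const { keys } = await kv.list();
  const values = await Promise.all(keys.map(({ name }) => kv.get(name)));
  const stats: Record<string, Record<string, unknown>> = {};
  keys.forEach(({ name }, i) => {
    stats[name] = safeParse(values[i] ?? null);
  });
  return stats;
}
```

The sketch reads a single page of keys. A namespace with more keys than one `list()` call returns would need cursor-based pagination, which is omitted here.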
Hey @dineshsutihar, I have made the changes suggested in the Copilot review; you can merge it now.
Nice work, @hemanth5055! I'm going to merge it now.
In the future, I think we can optimize this by sending the source language from the extension, so we won't need to call Gemini to detect the language.
Hey @dineshsutihar,
{
  "translation": "",
  "source_language": ""
}
This way, we can eliminate the extra API call that’s currently used just for language detection.
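Reading the proposal as having the single translation call return both fields, the Worker would need to validate the model's reply before trusting it. The sketch below is hypothetical: `parseTranslationReply` is an illustrative name, and it assumes the model may wrap its JSON in a Markdown code fence.

```typescript
// Shape taken from the JSON proposed in the comment above.
interface TranslationReply {
  translation: string;
  source_language: string;
}

// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3);

// Returns the validated reply, or null if the text is not the expected JSON object.
function parseTranslationReply(text: string): TranslationReply | null {
  let body = text.trim();
  if (body.startsWith(FENCE)) {
    // Drop the opening fence line (including any language tag) and the closing fence.
    const firstNewline = body.indexOf("\n");
    body = firstNewline === -1 ? "" : body.slice(firstNewline + 1).trim();
    if (body.endsWith(FENCE)) body = body.slice(0, -FENCE.length);
  }
  try {
    const parsed: unknown = JSON.parse(body);
    if (typeof parsed !== "object" || parsed === null) return null;
    const { translation, source_language } = parsed as Record<string, unknown>;
    if (typeof translation !== "string" || typeof source_language !== "string") return null;
    // Normalize the label so it can be used directly in an analytics key.
    return { translation, source_language: source_language.trim().toLowerCase() };
  } catch {
    return null;
  }
}
```

Returning null on any mismatch lets the caller decide whether to retry, fall back to separate detection, or skip the analytics update.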
Yes, that sounds like a solid approach, @hemanth5055. Please go ahead and raise a new issue for it.
Description:
This PR introduces a robust analytics and language detection system for the code translation feature. It provides real-time tracking of translation usage and enables detailed insights into which language pairs are most frequently used. The main updates include:
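- KV-based Analytics Tracking: records usage counts for source-target language pairs in a Cloudflare Workers KV namespace (LANG_TRANSLATION_ANALYTICS).
- Source Language Detection: automatically detects the source language of submitted code using Google Gemini.
- /v1/analytics Endpoint: returns usage statistics in which keys are source-target language pairs and values indicate their usage counts.
Testing:
- Verified that translation requests increment analytics counts correctly in KV.
- Tested the /v1/analytics endpoint to ensure all keys and counts are returned accurately.
- Confirmed that source language detection reliably identifies programming languages for various code snippets.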
Impact:
Next Steps / Suggestions for Reviewers:
Before testing or running the Worker locally, please add your KV namespace ID for LANG_TRANSLATION_ANALYTICS to backend/wrangler.jsonc, under the KV namespace binding.
closes #20