Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 0 additions & 39 deletions .claude/agents/corpus-fixer.md

This file was deleted.

13 changes: 11 additions & 2 deletions .claude/commands/corpus-loop.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
argument-hint: [corpus_slug]
description: uses Playwright MCP and the `corpus:view` to parse page elements
description: uses Playwright MCP and the `corpus` to parse page elements
---

- using Playwright MCP, navigate to `http://localhost:3001/corpus/$1/gitcasso`
Expand Down Expand Up @@ -54,4 +54,13 @@ If you see `"title": "TODO_TITLE"` or similar hardcoded `TODO` values in the JSO

- Don't hedge your bets and write lots of fallback code or strings of `?.`. Have a specific piece of data you want to get, use non-null `!` assertions where necessary to be clear about getting.
- If a field is empty, represent it with an empty string. Don't use placeholders when extracting data.
- The pages we are scraping are going to change over time, and it's easier to fix broken ones if we know exactly what used to work. If the code has lots of branching paths, it's harder to tell what it was doing.
- The pages we are scraping are going to change over time, and it's easier to fix broken ones if we know exactly what used to work. If the code has lots of branching paths, it's harder to tell what it was doing.

## Troubleshooting

- If you see `"spot": "NO_SPOT"` but expect an enhancer to match:
- Check console logs for enhancer attempts (e.g., `"eE examing url"`)
- Look for specific rejection reasons in the enhancer's `tryToEnhance` method
- If multiple enhancers are conflicting:
- Check the order of enhancer registration in `registries.ts`
- Ensure proper exclusion logic (e.g., checking for specific containers)
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ When the `textarea` gets removed from the page, the `TextareaRegistry` is notifi
## Testing

- `pnpm playground` gives you a test environment where you can tinker with the popup with various test data, supports hot reload
- `pnpm corpus:view` gives you recordings of various web pages which you can see with and without enhancement by the browser extension
- `pnpm corpus` gives you recordings of various web pages which you can see with and without enhancement by the browser extension

### Test Corpus

Expand Down Expand Up @@ -114,7 +114,7 @@ We maintain a corpus of test pages in two formats for testing the browser extens

#### Viewing Corpus Files

- Run `pnpm corpus:view` to start the test server at http://localhost:3001
- Run `pnpm corpus` to start the test server at http://localhost:3001
- Select any corpus file to view in two modes:
- **Clean**: Original page without extension
- **Gitcasso**: Page with extension injected for testing
Expand Down
2 changes: 1 addition & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@
"playground": "vite --config vite.playground.config.ts",
"playground:build": "vite build --config vite.playground.config.ts",
"corpus:har:record": "tsx tests/corpus-har-record.ts",
"corpus:view": "tsx tests/corpus-view.ts"
"corpus": "tsx tests/corpus-view.ts"
},
"type": "module",
"version": "0.0.1"
Expand Down
79 changes: 43 additions & 36 deletions tests/corpus-view.ts
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,8 @@ const WEBEXTENSION_POLYFILL_REPLACEMENT =
'console.warn("Webextension-polyfill check bypassed for corpus testing")'
const BROWSER_API_MOCKS =
'window.chrome=window.chrome||{runtime:{getURL:path=>"chrome-extension://gitcasso-test/"+path,onMessage:{addListener:()=>{}},sendMessage:()=>Promise.resolve(),id:"gitcasso-test"}};window.browser=window.chrome;'
const PERMISSIVE_CSP = "default-src 'self' 'unsafe-inline' 'unsafe-eval' data: blob: http: https:;"
const PERMISSIVE_CSP =
"default-src 'self' 'unsafe-inline' 'unsafe-eval' data: blob: http: https:; connect-src 'self' http: https:; script-src 'self' 'unsafe-inline' 'unsafe-eval';"

// UI Styles
const REBUILD_BUTTON_STYLES = `
Expand Down Expand Up @@ -125,19 +126,20 @@ app.get('/', async (_req, res) => {
const links = Object.entries(CORPUS)
.map(([key, entry]) => {
const description = entry.description
? `<div style="color: #666; font-size: 0.9em; margin-top: 5px;">${entry.description}</div>`
? `<div style="color: #666; font-size: 0.9em;">${entry.description}</div>`
: ''
return `
<li>
<div style="margin-bottom: 10px;">
<div style="font-weight: bold; color: #555;">${key}</div>
<div style="font-size: 0.9em; color: #888;">${entry.type.toUpperCase()}</div>
${description}
</div>
<div style="display: flex; gap: 10px;">
<a href="/corpus/${key}/clean" style="flex: 1; text-align: center;">🔍 Clean</a>
<a href="/corpus/${key}/gitcasso" style="flex: 1; text-align: center;">🚀 Gitcasso</a>
<div style="display: flex; justify-content: space-between; align-items: center;">
<div>
<strong>${key}</strong> <span style="color: #888;">${entry.type.toLowerCase()}</span>
</div>
<div style="display: flex; gap: 15px;">
<a href="/corpus/${key}/clean">🔍 clean</a>
<a href="/corpus/${key}/gitcasso">🚀 gitcasso</a>
</div>
</div>
${description}
</li>
`
})
Expand All @@ -150,36 +152,27 @@ app.get('/', async (_req, res) => {
<title>Corpus Viewer</title>
<style>
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
max-width: 700px;
margin: 50px auto;
padding: 20px;
font-family: system-ui, sans-serif;
max-width: 600px;
margin: 20px auto;
padding: 10px;
}
h1 { color: #333; }
ul { list-style: none; padding: 0; }
li { margin: 20px 0; padding: 20px; background: #f8f9fa; border-radius: 8px; border: 1px solid #e9ecef; }
h1 { margin-bottom: 10px; }
ul { list-style: none; padding: 0; margin: 0; }
li { margin: 4px 0; padding: 8px 12px; background: #f9f9f9; }
a {
display: block;
padding: 12px 20px;
background: #fff;
color: #0066cc;
text-decoration: none;
color: #333;
border-radius: 6px;
border: 1px solid #dee2e6;
transition: all 0.2s;
font-size: 0.9em;
}
a:hover:not([style*="pointer-events: none"]) { background: #e9ecef; transform: translateY(-1px); }
code { background: #f1f3f4; padding: 2px 4px; border-radius: 3px; font-size: 0.9em; }
a:hover { text-decoration: underline; }
</style>
</head>
<body>
<h1>📄 Corpus Viewer</h1>
<p>Select a recorded page to view:</p>
<h1>Corpus Viewer</h1>
<ul>${links}</ul>
<div style="margin-top: 40px; padding: 20px; background: #f8f9fa; border-radius: 8px; border-left: 4px solid #007acc;">
<h3>Corpus Types</h3>
<p><strong>HAR:</strong> Automated network captures of initial page loads</p>
<p><strong>HTML:</strong> Manual SingleFile captures of post-interaction states</p>
<div style="margin-top: 20px; padding: 10px; background: #f9f9f9;">
<strong>HAR:</strong> Network captures | <strong>HTML:</strong> Manual captures
</div>
</body>
</html>
Expand Down Expand Up @@ -237,6 +230,10 @@ app.get('/corpus/:key/:mode(clean|gitcasso)', async (req, res) => {

// Replace external URLs with local asset URLs
let html = mainEntry.response.content.text!

// Strip CSP headers that might block our injected scripts
html = stripCSPFromHTML(html)

domains.forEach((domain) => {
const escapedDomain = domain.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
const regex = new RegExp(`https?://${escapedDomain}`, 'g')
Expand All @@ -245,6 +242,13 @@ app.get('/corpus/:key/:mode(clean|gitcasso)', async (req, res) => {
if (mode === 'gitcasso') {
html = injectGitcassoScriptForHAR(key, html)
}

// Set permissive headers for HAR corpus to allow rebuild requests
res.set({
'Content-Security-Policy': PERMISSIVE_CSP,
'X-Content-Type-Options': 'nosniff',
})

return res.send(html)
} else if (entry.type === 'html') {
// Handle HTML corpus
Expand Down Expand Up @@ -385,12 +389,15 @@ app.listen(PORT, () => {

// Strip CSP meta tags and headers from HTML that might block our scripts
function stripCSPFromHTML(html: string): string {
// Remove CSP meta tags
html = html.replace(/<meta[^>]*http-equiv\s*=\s*["']content-security-policy["'][^>]*>/gi, '')
html = html.replace(/<meta[^>]*name\s*=\s*["']content-security-policy["'][^>]*>/gi, '')
// Remove CSP meta tags - more comprehensive patterns
html = html.replace(/<meta[^>]*http-equiv\s*=\s*["']?content-security-policy["']?[^>]*>/gi, '')
html = html.replace(/<meta[^>]*name\s*=\s*["']?content-security-policy["']?[^>]*>/gi, '')

// Also match patterns where content-security-policy appears anywhere in the meta tag
html = html.replace(/<meta[^>]*content-security-policy[^>]*>/gi, '')

// Remove any other restrictive security meta tags
html = html.replace(/<meta[^>]*http-equiv\s*=\s*["']x-content-type-options["'][^>]*>/gi, '')
html = html.replace(/<meta[^>]*http-equiv\s*=\s*["']?x-content-type-options["']?[^>]*>/gi, '')

return html
}
Expand Down