Mobile-first academic thesis browsing with a TikTok-style vertical feed. The project is intentionally lightweight: a React frontend, a small Express API, and adapter points for a Playwright-based YOKTez scraper.
+---------------------------+        +---------------------------+
| React / Vite frontend     |        | Optional OpenAI summary   |
|                           |        | generation                |
| - snap-scrolling feed     |        +-------------+-------------+
| - search + bookmarks      |                      |
| - abstract expansion      |                      |
+-------------+-------------+                      |
              | /api                               |
              v                                    |
+-------------+-------------+                      |
| Express API gateway       |<---------------------+
|                           |
| - /feed                   |
| - /random-thesis          |
| - /search                 |
| - /thesis/:id             |
| - /thesis/:id/summary     |
+-------------+-------------+
              |
              v
+-------------+-------------+
| YOKTez adapter layer      |
|                           |
| - cache in memory / Redis |
| - maps scraper output     |
| - calls Playwright server |
+-------------+-------------+
              |
              v
+---------------------------+
| YOKTez website via        |
| Playwright scraper / MCP  |
+---------------------------+
/random-thesis returns one thesis object for the cold-start experience.
/feed returns a paginated feed:
{
  "items": [{ "id": "tez-001", "title": "..." }],
  "nextCursor": 4
}
/search returns search results shaped for the same feed card UI:
{
  "query": "kuraklik",
  "count": 1,
  "items": [{ "id": "tez-003", "title": "..." }]
}
/thesis/:id returns a normalized thesis detail record.
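The paginated feed response shown earlier (an `items` array plus a `nextCursor`) could be produced by a helper along these lines. This is only a sketch: `paginate` is a hypothetical name, and treating the cursor as a zero-based offset with `null` marking the last page is an assumption, not something the API is confirmed to do.

```javascript
// Hypothetical helper: slice an in-memory list into one feed page.
// Assumption: `cursor` is a zero-based offset into the full list.
function paginate(allItems, cursor = 0, limit = 4) {
  const start = Math.max(0, Number(cursor) || 0);
  const items = allItems.slice(start, start + limit);
  const end = start + items.length;
  // null signals that there are no more pages to request.
  const nextCursor = end < allItems.length ? end : null;
  return { items, nextCursor };
}

// Six placeholder records, split into a page of four and a page of two.
const sample = Array.from({ length: 6 }, (_, i) => ({
  id: `tez-00${i + 1}`,
  title: "...",
}));
const firstPage = paginate(sample, 0, 4);
const lastPage = paginate(sample, firstPage.nextCursor, 4);
```

The client would pass `nextCursor` back on the next request and stop when it receives `null`.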
/thesis/:id/summary generates a short, mobile-friendly AI summary. It falls back to a local heuristic if no OPENAI_API_KEY is configured.
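The actual fallback heuristic is not described here. As one possible illustration, the sketch below keeps the leading sentences of an abstract within a character budget. Both `heuristicSummary` and `pickSummaryMode` are hypothetical names, and the 280-character default is an arbitrary assumption.

```javascript
// Hypothetical fallback: keep leading sentences of the abstract
// until a character budget is reached.
function heuristicSummary(abstract, maxChars = 280) {
  const sentences = abstract.match(/[^.!?]+[.!?]+/g) || [abstract];
  let summary = "";
  for (const sentence of sentences) {
    const next = (summary + " " + sentence.trim()).trim();
    if (next.length > maxChars) break;
    summary = next;
  }
  // If even the first sentence is too long, hard-truncate the abstract.
  return summary || abstract.slice(0, maxChars).trim();
}

// Decide which path to take; the OpenAI call itself is left out of this sketch.
function pickSummaryMode(env) {
  return env.OPENAI_API_KEY ? "openai" : "heuristic";
}
```

On the server, `pickSummaryMode(process.env)` would select the branch before any network call is made.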
App
|- FloatingHeader
|  |- SearchForm
|- Feed
|  |- SnapScreen[]
|     |- ThesisCard
|        |- AbstractPanel
|        |- KeywordChips
|        |- SummaryPanel
|        |- ActionBar
- Cache normalized thesis records by thesis id to avoid repeat scraping.
- Cache feed pages and search results for 10 to 30 minutes.
- Persist hot queries in Redis if traffic grows or the scraper runs remotely.
- Pre-warm random-thesis and common search terms with a cron job.
- Save AI summaries separately because they are deterministic enough to reuse.
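The caching notes above can be sketched as a small in-memory TTL cache. `TtlCache` and `cached` are hypothetical names, the key format is an assumption, and a Redis-backed version could keep the same `get`/`set` interface.

```javascript
// Minimal in-memory TTL cache (illustrative only).
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
    this.entries = new Map();
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Wrap any async loader so repeated calls within the TTL skip the scraper.
async function cached(cache, key, loader) {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await loader();
  cache.set(key, value);
  return value;
}
```

For example, a feed cache with a 10-minute TTL would be `new TtlCache(10 * 60 * 1000)`, keyed by something like `feed:<cursor>`.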
- Review YOKTez terms of service, robots directives, and PDF access restrictions before scraping.
- Rate-limit scraper traffic aggressively and identify your service honestly if possible.
- Avoid exposing private or restricted theses; only surface metadata or documents that YOKTez already permits publicly.
- Make AI summaries clearly labeled as machine-generated and keep links to the original thesis.
- Provide a takedown or contact path in case content owners object to reuse.
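As a rough illustration of the rate-limiting point above, the sketch below serializes scraper calls and enforces a minimum gap between them. `createThrottle` and `delayBeforeNext` are hypothetical names, and a suitable gap depends on what YOKTez permits.

```javascript
// How long to wait before the next call may start.
function delayBeforeNext(lastStartMs, minGapMs, nowMs) {
  return Math.max(0, lastStartMs + minGapMs - nowMs);
}

// Hypothetical throttle: run tasks one at a time with a minimum gap.
function createThrottle(
  minGapMs,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let chain = Promise.resolve();
  let lastStart = -Infinity;
  return function schedule(task) {
    const run = chain.then(async () => {
      const wait = delayBeforeNext(lastStart, minGapMs, Date.now());
      if (wait > 0) await sleep(wait);
      lastStart = Date.now();
      return task();
    });
    // Keep the chain alive even if a task rejects.
    chain = run.catch(() => {});
    return run;
  };
}
```

A caller would create one shared `schedule` function per scraper target and pass every scrape through it, for example `schedule(() => scrapeThesis(id))`, where `scrapeThesis` stands in for whatever the real scraper call is.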
- Replace the stub in server/yoktezClient.js with real calls to your Playwright scraper service.
- Normalize scraper output into the shared thesis shape used by the frontend.
- Add Redis or SQLite caching once real scrape latency is known.
- Swap sample data for live feed generation and add better cursoring.
- Add authentication only if you later want synced bookmarks.
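The normalization step in the list above might look like the sketch below. The raw field names (`thesisNo`, `abstractText`, a delimited `keywords` string) are hypothetical, since the real scraper output is not shown here. Of the target fields, only `id` and `title` appear in the sample responses; `abstract` and `keywords` are guesses based on the AbstractPanel and KeywordChips components.

```javascript
// Hypothetical mapping from raw scraper output to the shared thesis shape.
// All raw field names are assumptions made for illustration.
function normalizeThesis(raw) {
  const keywords = Array.isArray(raw.keywords)
    ? raw.keywords
    : String(raw.keywords || "").split(/[;,]/);
  return {
    id: String(raw.thesisNo ?? raw.id ?? ""),
    title: String(raw.title ?? "").trim(),
    abstract: String(raw.abstractText ?? raw.abstract ?? "").trim(),
    // Trim each keyword and drop empty entries left by stray delimiters.
    keywords: keywords.map((k) => String(k).trim()).filter(Boolean),
  };
}
```

Keeping this mapping in one function means the frontend cards never need to know how the scraper names its fields.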
npm install
npm run dev
The frontend runs on http://localhost:5173 and proxies API calls to the Express server on http://localhost:3001.
The frontend can now talk to any hosted TezTok API by setting:
VITE_API_BASE_URL=https://your-api.example.com
Without that env var, local development still uses the Vite /api proxy.
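One way the client could choose between the hosted API and the local proxy is sketched below. `resolveApiUrl` is a hypothetical helper. The sketch assumes a hosted API serves its routes directly under the base URL, and the real client may build URLs differently.

```javascript
// Hypothetical helper: build a request URL from an env object and a route path.
function resolveApiUrl(path, env = {}) {
  // Strip trailing slashes so the join never produces "//".
  const base = (env.VITE_API_BASE_URL || "").replace(/\/+$/, "");
  const suffix = path.startsWith("/") ? path : `/${path}`;
  // With no base configured, fall back to the relative /api prefix
  // that the Vite dev proxy forwards to the Express server.
  return base ? `${base}${suffix}` : `/api${suffix}`;
}
```

In a Vite app the env object would be `import.meta.env`, for example `fetch(resolveApiUrl("/feed", import.meta.env))`.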
Note: the live YOKTez scraping flow still belongs on a server or worker. The client app can be deployed independently, but the YOKTez session/cookie scraping layer should stay in a hosted backend.
The web app now builds as a PWA.
- The app shell and static assets are precached during the production build.
- Feed requests are runtime-cached, so previously opened screens can load again when you reopen the app without the local server running.
- Client-only providers such as Crossref, OpenAlex, Semantic Scholar, CORE, and direct arXiv can benefit from cached responses after they have been fetched once.
- YÖK Tez still needs a reachable server for new uncached data, because that provider depends on a backend scraper/API.
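The offline behavior described above resembles a network-first lookup with a cache fallback. The sketch below illustrates that idea in plain JavaScript with injected dependencies. It is not the app's actual service worker configuration, and `networkFirst`, `fetchJson`, and `store` are hypothetical names.

```javascript
// Network-first lookup with a cache fallback (illustrative only).
// `fetchJson` is any async function returning parsed data;
// `store` is a Map-like object with has/get/set.
async function networkFirst(url, fetchJson, store) {
  try {
    const fresh = await fetchJson(url);
    store.set(url, fresh); // remember the latest good response
    return { data: fresh, fromCache: false };
  } catch (err) {
    if (store.has(url)) return { data: store.get(url), fromCache: true };
    throw err; // nothing cached: surface the original failure
  }
}
```

A feed page that was fetched once can then be served from `store` when the server is unreachable, while a page that was never fetched still fails.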
To test it locally:
npm run build
npm run start
Open the app once, let a few feed pages load, then stop the server or go offline and reopen the installed app to verify that cached content still appears.