KCET Cutoff Analyzer

Context:

The Karnataka Common Entrance Test (KCET) is an annual entrance exam conducted by the Karnataka Examination Authority (KEA) for admission into various undergraduate courses in engineering, architecture, pharmacy, agriculture, and other allied courses offered by colleges in the state of Karnataka.

Pain Point

The default cutoff list from KEA is huge and confusing. A custom table built from specific filters would help a lot during the application process. I faced this pain point when I was applying, and again recently when my brother was.

The idea here is to act like a filter and return consistent data. But the major setback is parsing the damn cutoff PDF, which is extremely tight, has very little padding, and uses a weird table format (see the KCET Cutoff List).

Tech Stack

I wanted to keep the tech stack simple, since I wanted to focus more on improving PDF table parsing. I have used Vercel and DigitalOcean for deployments.

  • Frontend: Vite, React, Tailwind, PrimeReact and React Router
  • Backend: Python, Frappe, Redis, MariaDB

Solution

I started with tabula-py, but it missed a lot of rows. For starters, a single row with 20 columns would get split unpredictably: the data for 4 columns would land in one row, the data for some x columns would be spread across 2 rows, and so on. No pattern, pure randomness. I tried writing logic to handle it, but also explored other alternatives simultaneously.
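For reference, the tabula-py attempt looked roughly like the sketch below. The PDF filename, page range, and the 20-column count are placeholders for illustration, not the project's actual values.

```python
# Rough sketch of the tabula-py attempt; filename and column count are assumptions.
import tabula

EXPECTED_COLUMNS = 20  # each cutoff row spans roughly 20 category columns

# lattice=True makes tabula follow the table's ruling lines; the KCET PDF's
# tight layout is exactly where this tends to break down.
tables = tabula.read_pdf(
    "kcet_cutoff.pdf",
    pages="all",
    lattice=True,
    multiple_tables=True,
)

# Count how many extracted rows are missing values, to get a feel for how
# badly cell values were split or aggregated across rows.
broken = 0
for df in tables:
    for _, row in df.iterrows():
        if row.notna().sum() < EXPECTED_COLUMNS:
            broken += 1
print(f"rows with missing/aggregated values: {broken}")
```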

[Screenshot: single-cell value aggregation problem] [Screenshot: more such problems]

The second approach was OCR. The output was decent but required some cleanup, and it more or less resolved the value aggregation issue. So the idea was to use OCR to convert the PDF tables into CSV sheets.
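As a rough sketch of this route, something like pdf2image plus pytesseract could turn each page into text lines. The exact OCR tool used in this project isn't named here, so treat this as one possible approach rather than the actual pipeline.

```python
# Illustrative OCR route, assuming pdf2image + pytesseract.
from pdf2image import convert_from_path
import pytesseract

pages = convert_from_path("kcet_cutoff.pdf", dpi=300)  # render PDF pages as images

lines = []
for page in pages:
    # psm 6 treats the page as a uniform block of text, which works tolerably
    # for dense tables but still needs cleanup afterwards.
    text = pytesseract.image_to_string(page, config="--psm 6")
    lines.extend(l for l in text.splitlines() if l.strip())

# The raw OCR output still needs cleanup before it becomes a usable CSV,
# e.g. collapsing runs of whitespace into comma separators.
with open("kcet_cutoff_raw.csv", "w") as f:
    for line in lines:
        f.write(",".join(line.split()) + "\n")
```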

But there is one more problem: if you don't scale down the table (in Google Sheets) before exporting, you get value aggregation again, which is a nightmare to solve. So I scaled it down to 70-80% and then exported it in A3 landscape format with custom page breaks.

Using this CSV sheet, I was able to populate my DB. There are still consistency issues, and I am looking for alternatives, but this has to go out with the MVP. I want something live so I can get critical feedback on my work, rather than spend time figuring out an optimized approach before releasing. I figured that once the pipeline was set up, I could improve data quality later on.
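For illustration, populating the Frappe/MariaDB backend from the cleaned CSV could look something like the sketch below; the DocType name ("KCET Cutoff") and field names are guesses, not the project's actual schema.

```python
# Hedged sketch of loading the cleaned CSV into a Frappe backend.
# Run inside a Frappe site context (e.g. via bench execute).
import csv
import frappe

def import_cutoffs(csv_path):
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # DocType and field names below are illustrative placeholders.
            doc = frappe.get_doc({
                "doctype": "KCET Cutoff",
                "college_code": row["college_code"],
                "branch": row["branch"],
                "category": row["category"],
                "cutoff_rank": int(row["cutoff_rank"]),
            })
            doc.insert(ignore_permissions=True)
    frappe.db.commit()
```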

And that's what I did: after 2 months of effort, the app is at the MVP stage, so I have planned to deploy it. I also plan to improve things like exporting the cutoff sheet as CSV and exposing CRUD APIs to allow development on top of the existing KCET cutoff data.
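One possible shape for such a cutoff API is a Frappe whitelisted method like the one below; the method, DocType, and filter names are hypothetical.

```python
# Hypothetical read-only cutoff endpoint as a Frappe whitelisted method.
import frappe

@frappe.whitelist(allow_guest=True)
def get_cutoffs(branch=None, category=None, max_rank=None):
    filters = {}
    if branch:
        filters["branch"] = branch
    if category:
        filters["category"] = category
    if max_rank:
        filters["cutoff_rank"] = ["<=", int(max_rank)]
    # frappe.get_all returns matching rows as a list of dicts
    return frappe.get_all(
        "KCET Cutoff",
        filters=filters,
        fields=["college_code", "branch", "category", "cutoff_rank"],
    )
```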

Result

[Screenshot: result comparison] The output columns need to be more descriptive, which I am improving.

Note:

Right now, the project is not open for open-source development, as I still have to set up the development backend for debugging and testing. I plan to open it up once the project gets some traction or users. Here's the link to the Backend repository.

ToDos:

  • Make a monorepo of both the Frontend and Backend
  • Create a Development setup and expose cutoff APIs
