This project fine-tunes a Large Language Model (LLM) with LoRA to automatically generate high-quality Anki flashcards from knowledge sources such as Wikipedia articles or personal PDFs.
The system supports:
- Automatic training data generation from PDFs and Wikipedia pages
- LoRA fine-tuning
- Flashcard generation from PDFs
- Flashcard generation from Wikipedia pages
The goal is to produce concise, factual, and pedagogically effective Anki cards suitable for long-term learning.
The models used for data generation and LoRA fine-tuning can be adjusted to the user's computing power.
My personal fine-tuned model is available at https://huggingface.co/Guibibo/Mistral-7B-v0.3-FlashCards.
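The published model can be queried like any causal LM on the Hugging Face Hub. Below is a minimal sketch using the Transformers library; the prompt template in `build_prompt` is an assumption for illustration and should be matched to the format actually used during fine-tuning:

```python
# Hedged sketch: generating a flashcard with the published fine-tuned model.
# MODEL_ID comes from the README; the prompt template is a hypothetical
# example, not the exact format used in training.

MODEL_ID = "Guibibo/Mistral-7B-v0.3-FlashCards"

def build_prompt(passage: str) -> str:
    """Wrap a source passage in a flashcard-generation instruction."""
    return (
        "Generate a concise Anki flashcard (front and back) "
        f"from the following text:\n\n{passage}\n\nFlashcard:"
    )

if __name__ == "__main__":
    # Requires `pip install transformers accelerate` and enough memory
    # for a 7B model (roughly 16 GB in fp16).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(
        build_prompt("The mitochondrion produces most of the cell's ATP."),
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```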
- LoRA fine-tuning
- Domain-agnostic flashcard generation
- PDF parsing and knowledge extraction
- Wikipedia page ingestion
- Synthetic training data generation
- Anki-compatible outputs (APKG)
- Modular end-to-end pipeline
```bash
# Install dependencies
conda env create -f environment.yml

# Run the full pipeline
python3 -m main

# Combine revised training data files
python3 -m data.combine_data data/revised/file1.jsonl data/revised/file2.jsonl ...

# Convert combined data to the LoRA training format
python3 -m data.convert_to_lora path/to/file

# Run LoRA fine-tuning
python3 -m LoRA.run_lora
```
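The data-preparation steps above combine JSONL files of question/answer pairs and convert them into LoRA training examples. A minimal sketch of what that conversion might look like, assuming records with `question` and `answer` fields and a prompt/completion output format (the real field names and template live in `data/convert_to_lora`):

```python
import json

# Hypothetical instruction prefix -- the actual template is defined
# in data/convert_to_lora, not here.
INSTRUCTION = "Generate an Anki flashcard answer for the question below.\n"

def to_lora_example(record: dict) -> dict:
    """Convert one {question, answer} record into a prompt/completion pair."""
    return {
        "prompt": f"{INSTRUCTION}Question: {record['question']}\nAnswer:",
        "completion": record["answer"],
    }

def convert_file(src: str, dst: str) -> None:
    """Rewrite a JSONL file of Q/A records as LoRA training examples."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(json.dumps(to_lora_example(json.loads(line))) + "\n")
```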