Edulytica

Description

Edulytica automates the analysis of scientific and educational documents related to research work using large language models (LLMs), with the aim of reducing the time and effort teachers spend reviewing them. The result is two LLMs trained on specially collected data: one summarizes the text of a large document, and the other assesses whether the stated goals and objectives of the work have been achieved.

Features

Algorithms for summarizing large texts and for evaluating whether stated goals and objectives have been achieved;
Fast APP, an application for interacting with the trained models;
Separate trained models: one for summarization and one for assessing the achievement of goals and objectives;
Separate training datasets: one for summarization and one for goals and objectives.
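
As an illustration of the first feature, one common way to summarize a document that is too long for a single model call is to split it into chunks, summarize each chunk, and then summarize the combined partial summaries. The sketch below shows that general pattern only; it is not taken from the Edulytica source code. The names `split_into_chunks`, `summarize_document`, and `first_words` are hypothetical, and `summarize` stands in for whatever call the trained summarization model exposes.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_document(
    text: str,
    summarize: Callable[[str], str],
    max_words: int = 500,
) -> str:
    """Summarize each chunk, then summarize the joined partial summaries.

    `summarize` is a hypothetical stand-in for a call to a summarization model.
    """
    chunks = split_into_chunks(text, max_words)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])
    partial_summaries = [summarize(chunk) for chunk in chunks]
    return summarize(" ".join(partial_summaries))


def first_words(text: str, n: int = 5) -> str:
    """Toy 'model' that keeps the first n words; only here to make the sketch runnable."""
    return " ".join(text.split()[:n])


if __name__ == "__main__":
    document = " ".join(str(i) for i in range(20))
    print(summarize_document(document, first_words, max_words=10))
```

This sketch performs a single merge pass. If the joined partial summaries are still too long for the model, the same step would have to be repeated.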

Please help us improve this project: share your feedback by opening an issue!

Installation

1. Clone the repository

git clone https://github.com/LISA-ITMO/Edulytica.git

2. Activate the virtual environment (adjust the path to where yours was created)

source ~/PyProject/Edulytica/api_venv/bin/activate

3. Install requirements

pip install -r requirements.txt

4. Start Application

python3 src/edulytica_api/app.py

5. Start the Celery worker

celery -A src.edulytica_api.celery.tasks worker --loglevel=info -E -P gevent

6. Start the npm application

npm start

7. Start Flower to monitor Celery tasks

celery -A src.edulytica_api.celery.tasks flower
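
Before working through the steps above, it can help to confirm that the command-line tools they call are available. The following is a minimal sketch, not part of the repository. The tool list is inferred from the commands in this section, and the names `REQUIRED_TOOLS`, `find_tools`, and `missing_tools` are hypothetical. Since `celery` is installed by step 3, run the check with the virtual environment active.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Executables invoked by the installation steps (inferred from the commands above).
REQUIRED_TOOLS = ("git", "python3", "pip", "celery", "npm")


def find_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    missing = missing_tools(find_tools())
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```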

Getting started

To start, browse the JSON examples of the system's responses for a test sample of works.
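
The JSON examples can also be inspected programmatically. The sketch below assumes they sit somewhere under a local directory such as `examples/`; that directory name, the file names, and the structure of each response are assumptions, not something this README specifies. `load_json_examples` and `describe` are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_json_examples(directory: str) -> Dict[str, Any]:
    """Load every *.json file under directory, keyed by its path relative to it."""
    root = Path(directory)
    examples: Dict[str, Any] = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            examples[path.relative_to(root).as_posix()] = json.load(handle)
    return examples


def describe(value: Any) -> str:
    """Return a one-line description of the top-level shape of a parsed JSON value."""
    if isinstance(value, dict):
        return "object with keys: " + ", ".join(sorted(value))
    if isinstance(value, list):
        return f"array of {len(value)} items"
    return type(value).__name__


if __name__ == "__main__":
    # "examples" is an assumed location; point this at wherever the JSON files live.
    for name, content in load_json_examples("examples").items():
        print(f"{name}: {describe(content)}")
```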

Once the service is running, you can submit your own documents and review the results of their analysis.

Documentation

Documentation for each module is available at the links below:

algorithms - part of the text analysis task: a study of how much an AI-written source text has to be changed before AI detection systems no longer recognize it as AI-generated;
data_handling - an auxiliary module containing the data and document parsers used to generate datasets;
edulytica_api - the source code of the web service;
extracting_rules - an experiment in extracting design rules using an LLM;
rag - a package for an experiment with semantic search, using kNN and the mBERT model.
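
To illustrate the idea behind the rag experiment, semantic search with kNN ranks stored text embeddings by their similarity to a query embedding and returns the closest ones. The sketch below is a generic pure-Python version using cosine similarity; it is not the code from the rag package. In practice the vectors would come from an encoder such as mBERT, whereas the tiny vectors here are made up for the demonstration. `cosine_similarity` and `knn_search` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_search(
    query: Sequence[float],
    corpus: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (index, similarity) for the k corpus vectors most similar to query."""
    scored = [(i, cosine_similarity(query, vector)) for i, vector in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Made-up 2-dimensional "embeddings"; real ones would have hundreds of dimensions.
    corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for index, score in knn_search([1.0, 0.1], corpus, k=2):
        print(index, round(score, 3))
```

A brute-force scan like this is adequate for small collections. Larger corpora usually rely on an approximate nearest-neighbor index instead.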

Code documentation is available at the link.

Requirements

For more information, see the file requirements.txt.

Contacts

Our contacts:

Martsinkevich Viacheslav, slavamarcin@yandex.ru;
Tereshchenko Vladislav, vlad-tershch@yandex.ru;
Aminov Natig, natig.aminov@gmail.com.

Conferences

XIII Congress of Young Scientists, ITMO:
- Dvornikov A.S., Strizhov D.A., Untila A.A., Fedorov D.A. A study of generated text with respect to the recognition of changes by artificial intelligence identification services - 2024;
- Mischenko M.Yu., Mustafin D.E., Untila A.A. Assessing the relevance of unstructured data for LLM analysis and fine-tuning - 2024;
- Marakulin A.A., Dedkova A.V., Aminov N.S., Fedorov D.A. Comparative analysis of PEFT methods for fine-tuning large language models - 2024;
- Bogdanov M.A., Nikiforov M.A., Aminov N.S., Tereshchenko V.V., Fedorov D.A. Analysis of large documents using large language models - 2024;
53rd Conference of the Teaching Staff (PPS):
- Mustafin D.E., Krylov M.M., Tereshchenko V.V. Storing heterogeneous data for subsequent processing - 2023;
- Bogdanov M.A., Tereshchenko V.V., Aminov N.S. Preliminary analysis of educational process documents for subsequent topic modeling - 2023;
- Sinyukov L.V., Laptev E.I., Tereshchenko V.V. Assessing the influence of academic disciplines on coursework results using topic modeling - 2023;
- Dvornikov A.S., Strizhov D.A., Aminov N.S. Development of an LLM text classification model for automatically identifying documents written by artificial intelligence - 2023.

Authors

Tereshchenko Vladislav
Martsinkevich Viacheslav
Aminov Natig
Mischenko Maxim
Bogdanov Maxim
Dvornikov Artem
Laptev Egor
Sinyukov Lev
