Text2Trait

Text2Trait is a project that combines a user-friendly frontend application with a backend algorithm powered by the [LasUIE tool](https://github.com/ChocoWu/LasUIE/tree/master). Each part of the application requires different libraries, so every folder with a distinct utility contains a requirements.txt or pyproject.toml file that lets you install the appropriate versions of its dependencies.

📂 Repository Structure

Frontend Application
Backend Algorithm (LasUIE-based)
Utility Scripts
Dataset

🚀 Frontend Application

The frontend is relatively easy to use and designed for quick setup.

▶️ Getting Started

Install all dependencies listed in the pyproject.toml file.
Locate and run the app.py script:
```
python app.py
```
After running the script, your terminal will display a message similar to:
```
Running on http://127.0.0.1:5000/
```
Open the displayed link in your browser; the application should load immediately. Every part of the code in this section is commented and described. If you are unsure how a certain method works, you can find its description just under the method definition.

🚀 Backend Application

This backend leverages the LasUIE model and is centered around three key files:

run_finetune.py
run_inference.py
config.json

run_finetune.py: This script fine-tunes a selected backbone model from popular GLMs such as T5, BERT, or Flan-T5.
- A wide range of hyperparameters can be configured.
- Due to limitations of the original implementation, several updates were made to align with the LasUIE workflow.
- All modifications are clearly marked in the code for easy reference.
- Additionally, utils.py (inside the engine folder) has been updated with similar improvements.
run_inference.py: Designed for straightforward usage:
- Set your desired hyperparameters in the file.
- Ensure the correct directory is selected.
- Run the script.
- ⚠️ Note: When using a trained model from the checkpoint folder, make sure to update the model name in the file to match the one in the folder.
config.json: A configuration file containing general parameters that influence both training and inference, such as:
- Backbone model type
- Learning rate
- Other key settings
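
Before starting a training or inference run, it can help to print the settings that config.json actually contains. The snippet below is a minimal sketch of that check, not part of the repository: the key names `model_name` and `lr` are hypothetical placeholders, so replace them with the keys your config.json really uses.

```python
import json
from pathlib import Path


def load_config(path):
    """Read a JSON config file and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def describe_config(config, keys=("model_name", "lr")):
    """Return 'key=value' lines for the chosen keys.

    The default key names are hypothetical; absent keys are reported
    as '<not set>' instead of raising an error.
    """
    return [f"{key}={config.get(key, '<not set>')}" for key in keys]


if __name__ == "__main__":
    config_path = Path("config.json")
    if config_path.exists():
        print("\n".join(describe_config(load_config(config_path))))
    else:
        print("config.json not found in the current directory")
```

Run it from the folder that holds config.json. A key reported as `<not set>` means the name is not present in the file, which is a quick way to catch a misspelled setting.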

🛠️ Utility Scripts

This section provides a collection of lightweight, well-documented scripts to streamline data preparation for training. Each script is clearly named and does exactly what it promises. You can use them to:

Convert PDF data into .txt and .json formats
Transform Excel data into the required JSON training format
Split datasets into train, validation (dev), and test sets
Transform inference data into a knowledge graph that the application uses to visualize results
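
To illustrate the dataset-splitting step, here is a minimal sketch of a reproducible train/dev/test split. It is not the script shipped in the utils folder: the function name `split_dataset`, the 80/10/10 default ratio, and the fixed seed are all assumptions chosen for the example.

```python
import random


def split_dataset(records, dev_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle records reproducibly and split them into train/dev/test lists.

    The fractions and the seed are illustrative defaults, not values
    taken from the repository.
    """
    if dev_fraction < 0 or test_fraction < 0 or dev_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(records)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded RNG gives a repeatable order

    n_dev = int(len(shuffled) * dev_fraction)
    n_test = int(len(shuffled) * test_fraction)

    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]      # everything left over is training data
    return train, dev, test
```

Calling `split_dataset(samples)` on a list of training examples returns three lists that together contain every example exactly once. Fixing the seed keeps the split stable between runs, so results from different fine-tuning experiments remain comparable.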
