This project focuses on the topic of Facial Landmark Localization – identifying key characteristic points on the human face. We conduct a survey of previous research and delve into the architecture of FaceXFormer – a model that utilizes the Transformer architecture for this task.
In addition to analyzing the original model, we also pre-train it on a smaller dataset and apply the autocast technique to reduce computational costs.
The main objective of this project is to understand the structure of FaceXFormer and provide an overview of research in this field.
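To illustrate the autocast technique mentioned above, the snippet below is a minimal sketch of a mixed-precision training step using PyTorch's `torch.autocast` and `GradScaler`. The model, dataloader, and loss names are placeholders for illustration, not the actual FaceXFormer training pipeline.

```python
import torch

# Minimal sketch of a mixed-precision training step with PyTorch AMP.
# `model`, `train_loader`, `criterion`, and `optimizer` are placeholders,
# not the real FaceXFormer training code.
device = "cuda" if torch.cuda.is_available() else "cpu"
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

def train_one_epoch(model, train_loader, criterion, optimizer):
    model.train()
    for images, landmarks in train_loader:
        images, landmarks = images.to(device), landmarks.to(device)
        optimizer.zero_grad()
        # Run the forward pass in reduced precision to cut memory and compute.
        with torch.autocast(device_type=device, enabled=(device == "cuda")):
            preds = model(images)
            loss = criterion(preds, landmarks)
        # Scale the loss so fp16 gradients do not underflow, then step.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
```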
A short demo of our application: video
- 🔹 Python
- 🔹 PyTorch
- 🔹 Jupyter Notebook
- 🔹 Kaggle
- 🔹 Streamlit
facial-landmark-localization/
├── Evaluate/ # Model evaluation notebooks
├── Papers/ # Research documents
├── Related-works/ # Collection of related works
├── Source/ # Main source code
├── Surveys/ # Methodology overviews
├── planning.md # Implementation plan
└── README.md # Documentation
git clone https://github.com/nguyenvmthien/facial-landmark-localization.git
cd facial-landmark-localization
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
pip install -r requirements.txt
Open Jupyter Notebook and run the `.ipynb` files in the `Evaluate/` directory to view the model evaluation results.
Detailed results from training and re-evaluating the FaceXFormer model are presented in the notebooks within the `Evaluate/` folder.
The experiments focus on the model's ability to learn from smaller datasets and on the impact of the autocast technique on performance.
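For reference, facial landmark localization is commonly evaluated with the Normalized Mean Error (NME). The sketch below assumes predictions and ground truth are `(N, K, 2)` arrays of `(x, y)` coordinates and normalizes by the inter-ocular distance under the 68-point annotation scheme; the exact metric used in the evaluation notebooks may differ.

```python
import numpy as np

def nme(pred, gt, left_eye_idx=36, right_eye_idx=45):
    """Normalized Mean Error over a batch of landmark predictions.

    pred, gt: arrays of shape (N, K, 2) with (x, y) landmark coordinates.
    The eye indices assume the 68-point annotation scheme; normalization
    uses the ground-truth inter-ocular distance.
    """
    # Per-point Euclidean error, averaged over the K landmarks of each face.
    per_face_err = np.linalg.norm(pred - gt, axis=2).mean(axis=1)
    # Inter-ocular distance of the ground truth, used as the normalizer.
    inter_ocular = np.linalg.norm(gt[:, left_eye_idx] - gt[:, right_eye_idx], axis=1)
    return float((per_face_err / inter_ocular).mean())
```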
This project is released under the MIT License.