
# Tutorial for XQueryer Training, Validation, and Testing

## Environment Setup
We recommend creating a new Python environment with the following version:
```bash
Python==3.9.19
```
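
As a quick sanity check, the interpreter version can be verified from inside the new environment. This is a minimal sketch: the helper name `is_supported_python` is illustrative and not part of XQueryer, and comparing only the major and minor version (3.9) is an assumption.

```python
import sys


def is_supported_python(version_info=None, required=(3, 9)):
    """Return True if the interpreter's major.minor version matches `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    status = "OK" if is_supported_python() else "differs from the recommended 3.9.x"
    print(f"Python {sys.version.split()[0]}: {status}")
```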

## Demo Database
A demo database is provided for quick review and testing.  
To train your own model, **replace the demo data** with your own database or with our open-source database on [Hugging Face](https://huggingface.co/datasets/caobin/XQueryer/tree/main).

## RRUFF Data
Due to sharing restrictions, the RRUFF data cannot be included here. Download the data directly from the [official RRUFF website](https://rruff.info/), save the patterns in `.db` format, and process them with our provided code.

---

## Training the Model
To train the model, run the following command:
```bash
python train.py --batch_size 2 --epochs 1 --num_workers 0 --data_dir_train ../demo_data/demo_train.db --data_dir_val ../demo_data/demo_val.db
```

### Notes:
- Ensure you have set the correct paths for your training and validation databases.
- During training, an output folder (for example, `./output/2024-12-18_1338/`) is created to store the model checkpoints.  
  A checkpoint file is saved in this folder every five epochs.
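
Because each run writes its checkpoints to its own folder, it can be convenient to look up the newest checkpoint programmatically before running inference. The sketch below assumes the layout seen in the example paths of this tutorial (`./output/<run>/checkpoints/checkpoint_NNNN.pth`) and that folder and file names sort chronologically; `latest_checkpoint` is a hypothetical helper, not part of XQueryer.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(output_root: str = "./output") -> Optional[Path]:
    """Return the newest checkpoint under <output_root>/<run>/checkpoints/, or None.

    Assumes run folders and checkpoint files sort chronologically by name,
    as in `2024-12-18_1338/checkpoints/checkpoint_0001.pth`.
    """
    candidates = sorted(Path(output_root).glob("*/checkpoints/checkpoint_*.pth"))
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    print(latest_checkpoint() or "No checkpoint found under ./output")
```

The returned path can then be passed to the `--load_path` argument of the inference scripts.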

---

## Inference

### Inference for Structure
To infer the structure, run:
```bash
python infer.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
```

### Inference for Crystal System
To infer the crystal system, run:
```bash
python infer2cs.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
```

### Inference for Space Group
To infer the space group, execute:
```bash
python infer2sg.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
```
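
The three inference commands share the same arguments and differ only in the script name, so they can be driven from one small Python wrapper. This is a sketch under that assumption: the script names and flags are copied from the commands above, while `INFER_SCRIPTS`, `build_infer_command`, and `run_all` are illustrative names that do not exist in the repository.

```python
import subprocess
import sys

# Script names and flags are taken from the commands shown in this tutorial.
INFER_SCRIPTS = {
    "structure": "infer.py",
    "crystal_system": "infer2cs.py",
    "space_group": "infer2sg.py",
}


def build_infer_command(task, data_dir, load_path, batch_size=1, num_workers=0):
    """Build the argument list for one inference task."""
    return [
        sys.executable, INFER_SCRIPTS[task],
        "--batch_size", str(batch_size),
        "--num_workers", str(num_workers),
        "--data_dir", data_dir,
        "--load_path", load_path,
    ]


def run_all(data_dir, load_path):
    """Run the three inference scripts in turn; stop at the first failure."""
    for task in INFER_SCRIPTS:
        subprocess.run(build_infer_command(task, data_dir, load_path), check=True)
```

Calling `run_all("../demo_data/demo_test.db", "./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth")` from the directory that contains the scripts would reproduce the three commands above in sequence.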

---

### Additional Notes:
- Ensure the paths to your data and model checkpoints are correctly specified.
- Modify the `batch_size`, `num_workers`, or other parameters as needed for your specific hardware or dataset.
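
A mistyped path is easier to catch before a script starts loading the model. The following sketch checks that the data and checkpoint files exist; `missing_paths` is a hypothetical helper, and the example paths are the demo paths used in this tutorial.

```python
from pathlib import Path


def missing_paths(*paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    missing = missing_paths(
        "../demo_data/demo_test.db",
        "./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth",
    )
    for p in missing:
        print(f"Missing: {p}")
```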

Enjoy using **XQueryer** for your PXRD data analysis!

In [1]:
# To train the model, run the following command:

!python train.py --batch_size 2 --epochs 1 --num_workers 0 --data_dir_train ../demo_data/demo_train.db --data_dir_val ../demo_data/demo_val.db

output/2024-12-18_1338>>>>  Running on cpu  <<<<
Xmodel(
  (conv): ConvModule(
    (conv1): Conv1d(1, 32, kernel_size=(17,), stride=(1,), padding=(8,))
    (bn1): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act1): ReLU()
    (conv2): Conv1d(1, 32, kernel_size=(33,), stride=(1,), padding=(16,))
    (bn2): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act2): ReLU()
    (conv3): Conv1d(1, 32, kernel_size=(65,), stride=(1,), padding=(32,))
    (bn3): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act3): ReLU()
    (conv4): Conv1d(1, 32, kernel_size=(129,), stride=(1,), padding=(64,))
    (bn4): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act4): ReLU()
    (conv5): Conv1d(1, 32, kernel_size=(257,), stride=(1,), padding=(128,))
    (bn5): BatchNorm1d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act5): ReLU()

In [6]:
# To perform inference on the structure, use the following command:

!python infer.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth

Loaded model from ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
Loaded data from: ['../demo_data/demo_test.db']
Evaluating... : 100%|█████████████████████████| 10/10 [00:20<00:00,  2.05s/data]
Validation Loss:  11.481029987335205
Validation Accuracy:  0.0
Accuracy: 0.0%  (0/10)
Precision: 0.0%
Recall: 0.0%
F1 Score: 0.0%
THE END


In [7]:
# To infer the crystal system, run:

!python infer2cs.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth

Loaded model from ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
Loaded data from: ['../demo_data/demo_test.db']
Evaluating... : 100%|█████████████████████████| 10/10 [00:18<00:00,  1.87s/data]
Validation Loss:  11.481029033660889
Validation Accuracy:  0.0
Accuracy: 0.0%  (0/10)
Precision: 0.0%
Recall: 0.0%
F1 Score: 0.0%
THE END


In [8]:
# To infer the space group, execute:

!python infer2sg.py --batch_size 1 --num_workers 0 --data_dir ../demo_data/demo_test.db --load_path ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth

Loaded model from ./output/2024-12-18_1338/checkpoints/checkpoint_0001.pth
Loaded data from: ['../demo_data/demo_test.db']
Evaluating... : 100%|█████████████████████████| 10/10 [00:21<00:00,  2.20s/data]
Validation Loss:  11.481027317047118
Validation Accuracy:  0.0
Accuracy: 0.0%  (0/10)
Precision: 0.0%
Recall: 0.0%
F1 Score: 0.0%
THE END
