# Tabular Data Training using AutoTrain Advanced

In this notebook, we will train a model on tabular data using AutoTrain Advanced.
You can swap in any supported tabular model and any other dataset, as long as the dataset is formatted correctly.
For dataset formatting, see the [docs](https://huggingface.co/docs/autotrain/index).

In [1]:
from autotrain.trainers.tabular.params import TabularParams
from autotrain.project import AutoTrainProject

In [2]:
HF_USERNAME = "your_huggingface_username"
HF_TOKEN = "your_huggingface_write_token" # get it from https://huggingface.co/settings/tokens
# It is recommended to use secrets or environment variables to store your HF_TOKEN
# your token is required if push_to_hub is set to True or if you are accessing a gated model/dataset
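
As a minimal sketch of the environment-variable recommendation above, the token can be read at runtime instead of being pasted into the notebook. The helper name `load_hf_token` and the variable name `HF_TOKEN` are assumptions made for illustration; they are not part of AutoTrain.

```python
import os


def load_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in `env_var` (hypothetical helper, not an AutoTrain API)."""
    # Strip whitespace so a stray newline from a copied secret does not break authentication.
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; export it before starting the notebook."
        )
    return token
```

With the variable exported in the shell that launches the notebook, `HF_TOKEN = load_hf_token()` can replace the hard-coded string in the cell above.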

In [5]:
params = TabularParams(
    model="xgboost", # can be xgboost, lightgbm, catboost, randomforest, etc.
    data_path="your_tabular_dataset", # dataset ID on the Hugging Face Hub, or a local path
    target_columns=["target"], # the column(s) in the dataset that contain the target values
    id_column="id", # the column that contains unique identifiers (optional)
    train_split="train",
    valid_split="validation",
    task="classification", # can be "classification" or "regression"
    num_trials=10, # number of hyperparameter optimization trials
    time_limit=600, # time limit in seconds
    project_name="autotrain-tabular",
    push_to_hub=True,
    username=HF_USERNAME,
    token=HF_TOKEN,
)
# tip: you can use `?TabularParams` to see the full list of allowed parameters

If your dataset is in CSV format and is stored locally, make the following changes to `params`:

```python
params = TabularParams(
    data_path="data/", # path to the folder where train.csv is located
    target_columns=["target"], # the column name(s) in the CSV file that contain the target
    categorical_columns=["cat_col1", "cat_col2"], # list of categorical columns (optional)
    numerical_columns=["num_col1", "num_col2"], # list of numerical columns (optional)
    train_split="train", # this is the filename without extension
    valid_split="valid", # this is the filename without extension
    # ...
)
```
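
To make the expected folder layout concrete, the sketch below writes a tiny `train.csv` and `valid.csv` into a folder, using the column names from the snippet above. The helper `write_example_splits`, the `id` column, and all row values are made-up placeholders for illustration only; a real dataset would supply its own columns and rows.

```python
import csv
from pathlib import Path


def write_example_splits(folder: str = "data") -> Path:
    """Write placeholder train.csv and valid.csv files into `folder` (illustrative helper)."""
    root = Path(folder)
    root.mkdir(parents=True, exist_ok=True)

    # Column names mirror the params snippet; the values below are dummy data.
    header = ["id", "cat_col1", "cat_col2", "num_col1", "num_col2", "target"]
    splits = {
        "train": [[1, "a", "x", 0.5, 10, 0], [2, "b", "y", 1.5, 20, 1]],
        "valid": [[3, "a", "y", 0.7, 15, 0]],
    }

    # The file name without its extension is what train_split / valid_split refer to.
    for name, rows in splits.items():
        with open(root / f"{name}.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
    return root
```

Calling `write_example_splits("data")` produces `data/train.csv` and `data/valid.csv`, matching `data_path="data/"`, `train_split="train"`, and `valid_split="valid"`.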

In [None]:
# this will train the model locally
project = AutoTrainProject(params=params, backend="local", process=True)
project.create()