mh_one_api


MachineHack | Intel® oneAPI Hackathon 2023




Open in GitHub Codespaces

About oneAPI:

oneAPI is an open, cross-industry, standards-based, unified, multi-architecture, multi-vendor programming model that delivers a common developer experience across accelerator architectures – for faster application performance, more productivity, and greater innovation. The oneAPI initiative encourages collaboration on the oneAPI specification and compatible oneAPI implementations across the ecosystem.

About Intel® Developer Cloud

Intel® Developer Cloud offers several configurations that are tuned to various workloads. From AI and inference training to FPGA development to edge prototyping and preproduction deployment, you can use the environment that best matches your business needs.

  • Features of Intel Dev Cloud:
    Learn with hands-on tutorials.
    Experiment with real-world code samples.
    Evaluate performance and acceleration with multiple hardware configurations.
    Build heterogeneous applications.
    Develop your own prototype.
    Benchmark your own AI workloads with always-on access to the latest AI hardware.

Problem statement

While text-based tasks are everywhere, one of the most compelling objectives is building a question-answering system tailored to textual data. Imagine a system capable of sifting through vast datasets, identifying the 'span_start' and 'span_end' positions within the 'Story' text, extracting the relevant 'span_text', and generating a response that aligns with the provided 'Answer' for each question.
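To make the expected data concrete, here is a minimal sketch of a single record in the format the problem statement implies; the field names come from the statement above, while the story, question, and values are made-up examples.

# a hypothetical record matching the fields named in the problem statement;
# the story and values below are illustrative, not from the actual dataset
story = "Intel announced the oneAPI initiative in 2019."
span_text = "2019"
span_start = story.find(span_text)             # character offset where the span begins
record = {
    "Story": story,
    "Question": "When was oneAPI announced?",
    "span_start": span_start,
    "span_end": span_start + len(span_text),   # offset just past the span
    "span_text": span_text,
    "Answer": "2019",                          # the response the model should generate
}
assert story[record["span_start"]:record["span_end"]] == record["span_text"]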

Detailed Architecture Flow:

(architecture flow diagram)

Technology Stack:

(technology stack diagram)

Step-by-Step Code Execution Instructions:

  • Clone the Repository
 $ git clone https://github.com/redR0b0t/mh_one_api
 $ cd mh_one_api
  • Train/fine-tune the flan-t5-xl model on Intel Dev Cloud.
#!/bin/bash

echo "----------checking if gpu available on current job-----------------"
# setting oneapi env and checking gpu
conda init bash
echo "-------------------------------------------"
groups  # Key group is render, PVC access is unavailable if you do not have render group present.
source /opt/intel/oneapi/setvars.sh --force
sycl-ls
export num_gpu="$(sycl-ls |grep "GPU" |wc -l)"
echo "num_gpu=$num_gpu\n"
export num_cpu="$(sycl-ls |grep "Xeon" |wc -l)"
echo "num_cpu=$num_cpu\n"
if [ $num_gpu == 0 && $num_cpu == 1] 
then 
    echo "---GPU not available exiting--------"
    scancel $SLURM_JOB_ID
fi 
echo "-------------------------------------------"



echo "starting fine tuning model"
cd "/home/u131168/mh_one_api/model/ft_model_pp/itp" # modify the directory path to the location of the repo on system
pip install -r "requirements.txt"

# To use ccl as the distributed backend in distributed training on CPU requires to install below requirement.
python -m pip install oneccl_bind_pt -f https://developer.intel.com/ipex-whl-stable-cpu

# installing Intel Extension for PyTorch (GPU/XPU build)
python -m pip install torch==2.0.1a0 intel_extension_for_pytorch==2.0.110+xpu -f https://developer.intel.com/ipex-whl-stable-xpu

# installing Intel Extension for Transformers
pip install intel-extension-for-transformers


export train_file="/home/u131168/mh_one_api/data/f_traind_v1.csv" # modify this path to the location of the repo on your system

export model_path="google/flan-t5-xl"

export checkpoint_dir="/home/u131168/mh_one_api/model/ft_models/flan-t5-xl_peft_ft_v2/"
# pick the second-newest checkpoint directory to resume from
export checkpoint_name=$(ls $checkpoint_dir | grep checkpoint | tail -2 | head -n 1)
export checkpoint_path="$checkpoint_dir$checkpoint_name"
echo $checkpoint_path

# export output_dir="/home/u131168/mh_one_api/model/ft_models/flan-t5-xl_peft_ft_v1"
export output_dir="$checkpoint_dir"


python finetune_seq2seq.py \
        --model_name_or_path $model_path \
        --resume_from_checkpoint $checkpoint_path \
        --bf16 True \
        --train_file $train_file \
        --per_device_train_batch_size 2 \
        --per_device_eval_batch_size 2 \
        --gradient_accumulation_steps 1 \
        --do_train \
        --learning_rate 1.0e-5 \
        --warmup_ratio 0.03 \
        --weight_decay 0.0 \
        --num_train_epochs 1 \
        --logging_steps 10 \
        --save_steps 100 \
        --save_total_limit 2 \
        --overwrite_output_dir \
        --output_dir $output_dir \
        --peft lora

echo "finished fine tuning model"

  • Perform inference on the test dataset with the finetuned flan-t5-xl-peft model on Intel Dev Cloud.
#!/bin/bash

echo "----------checking if gpu available on current job-----------------"
# setting oneapi env and checking gpu
echo "-------------------------------------------"
groups  # Key group is render, PVC access is unavailable if you do not have render group present.
source /opt/intel/oneapi/setvars.sh --force
sycl-ls
export num_gpu="$(sycl-ls |grep "GPU" |wc -l)"
echo "num_gpu=$num_gpu\n"
export num_cpu="$(sycl-ls |grep "Xeon" |wc -l)"
echo "num_cpu=$num_cpu\n"
if [ $num_gpu == 0 && $num_cpu == 1] 
then 
    echo "---GPU not available exiting--------"
    scancel $SLURM_JOB_ID
fi 
echo "-------------------------------------------"



echo "staring prediction"

#installing intel extension for pytorch for GPU
python -m pip install torch==2.0.1a0 torchvision==0.15.2a0 intel_extension_for_pytorch==2.0.110+xpu -f https://developer.intel.com/ipex-whl-stable-xpu

# installing intel extension for transformers
pip install intel-extension-for-transformers

pip install peft


python /home/u131168/mh_one_api/model/p_custom_pp/p_custom.py # modify the directory path to the location of the repo on system

echo "finished precition"




Run the chatbot application to interact with the finetuned flan-t5-xl-peft model

  • Run the Python app to serve predictions to the frontend (a minimal sketch of such an endpoint follows the commands below).
  • (*the webapp won't work if the Python app is not running)
cd python_api
pip install -r ./reqs.txt
python ./ft5_cf.py
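ft5_cf.py is the repo's actual server; as a hypothetical illustration of the kind of endpoint the frontend talks to, here is a minimal Flask sketch (the framework, route, and payload field names are all assumptions):

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    # field names are assumptions; in practice load the model once at startup
    question, story = payload["question"], payload["story"]
    answer = generate_answer(question, story)  # hypothetical helper wrapping model.generate
    return jsonify({"answer": answer})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)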


Run the frontend application (webapp) to ask the model questions.

  • (*make sure the Python application is running before using the webapp)

  • Option 1: use the web app hosted on Hugging Face Spaces:

https://huggingface.co/spaces/blur0b0t/mh_one_api


  • Option 2: use the prebuilt files
cd mhi_pred_app/build/web
# open index.html in a browser to access the webapp
  • Option 3: build the app with the Flutter SDK (*the Flutter SDK needs to be installed on the system)
cd mhi_pred_app
flutter run -d web-server --host=0.0.0.0


Inference metrics for the finetuned model with and without Intel optimizations:

  • using Intel oneAPI optimizations reduced inference time by ~12.47% (a timing sketch follows below)
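As a sketch of how such a comparison can be measured, ipex.optimize from Intel Extension for PyTorch is applied to the eval-mode model and the same generation is timed before and after (the model, prompt, and repetition count here are illustrative; the ~12.47% figure comes from the repo's own measurement, not this sketch):

import time
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "google/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name, torch_dtype=torch.bfloat16).eval()
inputs = tokenizer("question: When was oneAPI announced? context: ...", return_tensors="pt")

def mean_latency(m, runs=5):
    # average wall-clock time of a fixed-length generation
    start = time.perf_counter()
    for _ in range(runs):
        with torch.no_grad():
            m.generate(**inputs, max_new_tokens=32)
    return (time.perf_counter() - start) / runs

baseline = mean_latency(model)
model_opt = ipex.optimize(model, dtype=torch.bfloat16)  # IPEX operator/memory-layout optimizations
optimized = mean_latency(model_opt)
print(f"baseline: {baseline:.3f}s  optimized: {optimized:.3f}s")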

(screenshot: inference timing comparison)


