# Intel® VTune™ Profiler


**Intel® VTune™ Profiler** optimizes application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more.

- **CPU, GPU, and FPGA**: Tune the entire application’s performance―not just the accelerated portion.
- **Multilingual**: Profile SYCL*, C, C++, C#, Fortran, OpenCL™ code, Python*, Google Go* programming language, Java*, .NET, Assembly, or any combination of languages.
- **System or Application**: Get coarse-grained system data for an extended period or detailed results mapped to source code.
- **Power**: Optimize performance while avoiding power- and thermal-related throttling. <br>

**To get started with Intel VTune** visit the article **[Get Started with Intel® VTune™ Profiler](https://www.intel.com/content/www/us/en/docs/vtune-profiler/get-started-guide/2023/overview.html)** and follow the instructions. <br>

**To download Intel VTune:**
- **[As Part of the Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html)**: Intel VTune Profiler is included in the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures. 
- **[Stand-Alone Version (recommended for this notebook)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler-download.html)**: A stand-alone download of Intel VTune Profiler is available. You can download binaries from Intel or choose your preferred repository.<br>

**To get help about usage of Intel VTune** visit the article **[Intel® VTune™ Profiler User Guide](https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/overview.html)**

**To know more about Intel VTune** visit the article **[Intel® VTune™ Profiler](https://www.intel.com/content/www/us/en/developer/tools/oneapi/vtune-profiler.html)**

## Install and Initialize VTune

1. **[Install Intel® VTune™ Profiler on your Linux* system.](https://www.intel.com/content/www/us/en/develop/documentation/vtune-install-guide/top/linux.html)**

2. Build your application with symbol information and in Release mode with all optimizations enabled. For detailed information on compiler settings, see the **[VTune Profiler online user guide.](https://software.intel.com/en-us/vtune-amplifier-help-compiler-switches-for-performance-analysis-on-windows-targets)** <br>
You can also use the matrix sample application available in `<install_directory>\sample\matrix`. You can see sample results in `<install-dir>\sample (matrix)`.

3. Set up the environment variables:

   `source <install-dir>/setvars.sh`

   By default, the `install-dir` is:

   - `$HOME/intel/oneapi/` when installed with user permissions
   - `/opt/intel/oneapi/` when installed with root permissions

To learn how to get started with VTune while using the VTune GUI, see this **[article](https://www.intel.com/content/www/us/en/docs/vtune-profiler/get-started-guide/2023/linux-os.html)**.
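
After sourcing `setvars.sh`, one quick sanity check is to confirm that the `vtune` executable resolves on `PATH`. The sketch below uses only the Python standard library; `find_vtune` is a hypothetical helper name introduced here for illustration and is not part of VTune.

```python
import shutil


def find_vtune(executable="vtune"):
    """Return the full path of the VTune CLI if it is on PATH, else None."""
    return shutil.which(executable)


vtune_path = find_vtune()
if vtune_path is None:
    # Most likely setvars.sh has not been sourced in this shell session.
    print("vtune not found on PATH; source <install-dir>/setvars.sh first")
else:
    print(f"vtune found at {vtune_path}")
```

Run this from the same shell session in which `setvars.sh` was sourced, since the environment variables do not carry over to new terminals.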

## DL_VTune

When profiling Python code with VTune, **DL_VTune** is a handy tool because it provides **ITT_TAGGING**, which can be used to profile individual blocks of code. You can choose **different start and end points** and give each profiled section its own name. <br>
To learn more about **DL_VTune**, visit its **[GitHub repo](https://github.com/intel-sandbox/DL_VTune)**.

### Pre-requisites for DL_VTune

- Intel VTune Profiler installation (see the *Install and Initialize VTune* section above).
- TensorFlow or PyTorch installation, in a version with oneDNN ITT tagging support. Please refer to **[dev_guide_profilers](https://oneapi-src.github.io/oneDNN/dev_guide_profilers.html)** for details. (If you want to profile the **[Intelligent-Indexing](https://github.com/oneapi-src/intelligent-indexing)** toolkit, see the instructions below to set up the **doc_class_stock** and **doc_class_intel** environments.)
- `itt-python` installation:
```
conda install -c conda-forge itt-python
```

### Install DL_VTune

After all the prerequisites are installed, clone the **[DL_VTune GitHub repo](https://github.com/intel-sandbox/DL_VTune)** and install it:

`git clone https://github.com/intel-sandbox/DL_VTune`

`cd DL_VTune`

`pip install .`

### Using DL_VTune

We will create different domains and tasks so that VTune results can be visualized easily. Place calls like the ones below around the code you want to profile.

**NOTE:** Please change **"domain"** and **"DLTasks"** according to your needs.

```python
import itt

# ...
domain = itt.domain_create("domain")
itt.task_begin(domain, "DLTasks")
# ... do the DLTasks ...
itt.task_end(domain)
```
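
The begin/end pairing above can be wrapped in a context manager so that a task is always closed, even if the profiled block raises. This is a minimal sketch, not part of DL_VTune: `itt_task` is a hypothetical helper, the section names are placeholders, and the fallback to a no-op when `itt-python` is not installed is an assumption made so the script still runs outside a profiling environment.

```python
from contextlib import contextmanager

try:
    import itt  # provided by the itt-python package
except ImportError:
    itt = None  # assumption: run without tagging when itt-python is absent


@contextmanager
def itt_task(domain, name):
    """Mark the enclosed block as a named ITT task (no-op without itt)."""
    if itt is None:
        yield
        return
    itt.task_begin(domain, name)
    try:
        yield
    finally:
        # Always close the task, even if the block raised an exception.
        itt.task_end(domain)


domain = itt.domain_create("domain") if itt is not None else None

# Two separately named sections; replace the bodies with real work.
with itt_task(domain, "load_data"):
    data = list(range(1000))

with itt_task(domain, "DLTasks"):
    total = sum(x * x for x in data)

print(total)
```

Each `with` block then appears as its own named task in the VTune results, which makes it easier to tell sections of the workload apart.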

## Example: Using **VTune** with the **Intelligent Indexing** Ref Kit

The **[Intelligent Indexing](https://github.com/oneapi-src/intelligent-indexing)** ref kit demonstrates one way of building an NLP pipeline that classifies documents by topic, and describes how the **Intel® AI Analytics Toolkit (AI Kit)** can be used to accelerate the pipeline.

The **Intel® AI Analytics Toolkit (AI Kit)** is used to achieve quick results even when the data for a model is large. It provides the capability to reuse code written in different languages so that hardware utilization is optimized to deliver these results.

The **Intelligent Indexing** ref kit has several Intel® oneAPI optimizations enabled, including:
- **[Intel® Distribution of Modin*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-of-modin.html#gs.v03x2l)**: a performant, parallel, and distributed dataframe system designed to make data scientists more productive. It provides drop-in acceleration for existing **pandas** workflows with no upfront cost of learning a new API, integrates with the Python* ecosystem, and scales across multiple cores with Ray* and Dask* clusters (run on and with what you have).
- **[Intel® Extension for Scikit-learn*](https://www.intel.com/content/www/us/en/developer/tools/oneapi/scikit-learn.html)**: a seamless way for data scientists to speed up scikit-learn applications. The extension dynamically patches scikit-learn estimators to use the Intel® oneAPI Data Analytics Library (oneDAL) as the underlying solver, speeding up machine learning algorithms out of the box.
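
The dynamic patching described above is exposed through `patch_sklearn()` in the `sklearnex` package. The sketch below shows one way to turn it on only when the extension is installed. `enable_sklearn_acceleration` is a hypothetical helper introduced for illustration, and whether the ref kit enables patching in exactly this way is not stated here.

```python
import importlib.util


def enable_sklearn_acceleration():
    """Patch scikit-learn if Intel Extension for Scikit-learn is available.

    Returns True when patching was applied, False otherwise.
    """
    if importlib.util.find_spec("sklearnex") is None:
        return False  # extension not installed; stock scikit-learn is used
    from sklearnex import patch_sklearn

    # Call this before importing scikit-learn estimators so that the
    # patched implementations are the ones that get imported.
    patch_sklearn()
    return True


accelerated = enable_sklearn_acceleration()
print("Intel Extension for Scikit-learn enabled:", accelerated)
```

Keeping the switch in one function makes it straightforward to profile the same script with and without the optimization.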

**NOTE:** Please visit the **[Intelligent Indexing](https://github.com/oneapi-src/intelligent-indexing)** ref kit page to learn more about the kit.
- Follow the steps in the GitHub repo to clone it and create the environments.
- After creating the environments, install **DL_VTune** in both **doc_class_stock** and **doc_class_intel** using the instructions above.

**We will use VTune to profile this workload below.**

To learn more about the analysis types available with **VTune**, see this **[article.](https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-1/running-command-line-analysis.html)** <br>
To learn how to use the **VTune** command line interface, see the **[cheatsheet.](https://www.intel.com/content/dam/develop/external/us/en/documents/vtune-profiler-cheat-sheet.pdf)**

#### Profile the Intelligent Indexing Ref Kit with Stock packages

To run **performance profiling** on the Intelligent Indexing ref kit with stock packages:

- Navigate to `intelligent-indexing/src/`: `cd intelligent-indexing/src/`
- Create a file **run_benchmarks_vtune.py** as shown above. A reference copy is provided.
- Copy the file to **intelligent-indexing/src/**.

Run the command below in the **Terminal** to **collect a performance snapshot** using stock packages:

`vtune -collect performance-snapshot -result-dir='../Profiling_Guide/Vtune_Profiler/VTune_Profiler_Results/stock_results/performance_snapshot_results/' -- python run_benchmarks_vtune.py -l ../logs/stock_stock.log`

##### To visualize results

Make sure that you are in the same directory as before, `intelligent-indexing/src/`. <br>
Run the command below in the Terminal to visualize the **VTune results**:

`vtune-backend --web-port=8080 --data-directory="../Profiling_Guide/Vtune_Profiler/VTune_Profiler_Results/stock_results/performance_snapshot_results/" --allow-remote-access`

#### Profile Intelligent Indexing Ref Kit with Intel oneAPI optimized packages

To run **performance profiling** on the Intelligent Indexing ref kit with Intel oneAPI optimized packages:

- Navigate to `intelligent-indexing/src/`: `cd intelligent-indexing/src/`
- Create a file **run_benchmarks_vtune.py** as shown above. A reference copy is provided.
- Copy the file to **intelligent-indexing/src/**.

Run the command below in the **Terminal** to **collect a performance snapshot** using oneAPI optimized packages:

`vtune -collect performance-snapshot -result-dir='../Profiling_Guide/Vtune_Profiler/VTune_Profiler_Results/oneapi_optimized_results/performance_snapshot_results/' -- python  run_benchmarks_vtune.py -i -l ../logs/intel_intel.log`
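
The stock and optimized collection commands differ only in the result directory, the log file, and the `-i` flag, so they can be generated from one helper when scripting repeated runs. This is a sketch under stated assumptions: `build_snapshot_command` is a hypothetical helper, and it only assembles and prints the argument lists using the flags and paths shown in the commands above. It does not invoke VTune.

```python
import shlex


def build_snapshot_command(result_dir, log_file, use_intel=False):
    """Return the vtune argument list for a performance-snapshot run."""
    target = ["python", "run_benchmarks_vtune.py"]
    if use_intel:
        target.append("-i")  # flag used in the optimized command above
    target += ["-l", log_file]
    return [
        "vtune",
        "-collect", "performance-snapshot",
        f"-result-dir={result_dir}",
        "--",  # everything after this is the profiled application
        *target,
    ]


base = "../Profiling_Guide/Vtune_Profiler/VTune_Profiler_Results"
stock_cmd = build_snapshot_command(
    f"{base}/stock_results/performance_snapshot_results/",
    "../logs/stock_stock.log",
)
intel_cmd = build_snapshot_command(
    f"{base}/oneapi_optimized_results/performance_snapshot_results/",
    "../logs/intel_intel.log",
    use_intel=True,
)

# Print shell-ready commands for inspection.
print(shlex.join(stock_cmd))
print(shlex.join(intel_cmd))
```

To execute a command from Python instead of copying it into the Terminal, the list can be passed to `subprocess.run(cmd, check=True)` from within `intelligent-indexing/src/`.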

##### To visualize results

Make sure that you are in the same directory as before, `intelligent-indexing/src/`. <br>
Run the command below in the Terminal to visualize the **VTune results**:

`vtune-backend --web-port=8080 --data-directory="../Profiling_Guide/Vtune_Profiler/VTune_Profiler_Results/oneapi_optimized_results/performance_snapshot_results/" --allow-remote-access`

#### Results

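Open the VTune web interface started by the commands above to inspect each performance snapshot, and compare the stock results with the Intel oneAPI optimized results.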