Skip to content

The objective of this assignment is to calculate how many triangles there are in a graph using C language along with multithreading techniques for better performance

Notifications You must be signed in to change notification settings

Billkyriaf/pds_assignment_1

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


Parallel and Distributed Systems Assignment 1

Aristotle University of Thessaloniki

School of Electrical & Computer Engineering

Contributors: Kyriafinis Vasilis, Koro Erika
Winter Semester 2021 - 2022



Table of Contents
  1. About this project
  2. Getting started
  3. Dependencies
  4. Project directory layout
  5. Complile and run
  6. Graphs
  7. Useful links

About this project

The objective of this assignment is to calculate the number of triangles in a given graph. The graphs given are large graphs with unweighted edges. Their matrices are sparse and very large to the point they don't fit in memory. To overcome this problem we use the CSR (Compressed Sparse Row) representation. The graphs are provided in matrix market format and their CSR vectors must be computed. After that, we square the initial sparse matrix and from the computed matrix we keep only the elements in the places that are non zero in the initial matrix.

The first part of the project is to calculate the number of triangles with serial programming and confirm that the proccess works. Afterwards, the workload must be computed using parallel programming for optimal performance using pthreads, openMP and Cilk.

Getting started

To setup this repository on your local machine run the following commands on the terminal:

git clone https://github.com/Billkyriaf/pds_assignment_1.git

Or alternatively download and extract the zip file of the repository

Dependencies

1. Make or Cmake

This project uses make utilities to build and run the excecutables. Alternatively you can use Cmake but openCilk is not supported in the build script.

2. OpenMP

Make sure you have an OpenMP capable compiler

3. OpenCilk-9.0.1

You can install OpenCilk for linux operating systems by running the following commands (some commands may require sudo):

  1. Download the compressed file from GitHub
    wget https://github.com/OpenCilk/opencilk-project/releases/download/opencilk%2Fbeta3/OpenCilk-9.0.1-Linux.tar.gz
    tar xvzf OpenCilk-9.0.1-Linux.tar.gz
    mv OpenCilk-9.0.1-Linux/ /usr/local/

    t is recommended to use the directory specified above otherwise you will have to edit the Makefile.

  2. Update the permissions
    chmod og+xr /usr/local/OpenCilk-9.0.1-Linux/ -R
  3. (Optional) Remove the compressed file
    rm -f OpenCilk-9.0.1-Linux.tar.gz

4. ANSI C library for Matrix Market I/O

The ANSI C library for Matrix Market I/O library is used for reading matrix market files that contain the graphs. The project already contains the files requiered for compiling and linking the files in the ext_libraries directory. The Matrix Market standard can be found here. In general coordinate pattern symmetric matrices are used. If the graphs represented in those matrices are Unweighted the weight for each edge is concidered as 1.


Project directory layout

Top-level

.
├─ Graphs                # Graph Data are stored here (.mtx .txt files)
├─ PythonDiagrams        # Pycharm Project to create plots
├─ TriangleCalculator    # Main Project
└─ README.md

TriangleCalculator

.
├── ...
├── TrangleCalculator
|   ├── .idea                # Idea folder for Clion project
|   ├── Julia                # Julia Code for the serial implementation of the algorithm
|   ├── src
|   |   ├── OpenCilk         # OpenCilk related source files
|   |   ├── OpenMP           # OpenMP related source files
|   |   ├── Pthreads         # Pthread related source files
|   |   ├── Serial           # Serial algorithm source files
|   |   ├── ext_libraries    # Extrernal dependencies source files
|   |   ├── libraries        # Utility source files
|   |   ├── opencilk_main.c 
|   |   ├── openmp_main.c
|   |   ├── pthread_main.c
|   |   └── serial_main.c
|   ├── CMakeLists.txt       # Cmake build script for Clion
|   └── Makefile             # Makefile for linux base distros 
|
└─ ...


Compile and run

Linux

Simply run make all in the TriangleCalculator directory. The executable files will be created in the linux_build directory along with the object files. To run an executable cd in the linux_build directory and run the command ./executable_name arg1 arg2. Run make clean to clear all the object and executable files. For more information on the command line arguments read bellow

Windows (OpenCilk not supported)

The code was developed on windows and specificaly in CLion. The easiest way to run the source code is to import the project to a Clion project. The CMakeLists.txt provides targets for executing every module of the program. The only extra step required is to create the configurations and add the command line arguments.


Command line argumnets (Both Linux and Windows)

  • Serial executable
    argv[1] = 'Path to the .mtx file of the praph'
  • Pthread executable
    argv[1] = 'Path to the .mtx file of the praph'
    argv[2] = number of threads to spawn and divide the work (be reasonable max number of threads is not specified)
  • Serial executable
    argv[1] = 'Path to the .mtx file of the praph'
    argv[2] = number of threads to request from the OpenMP API
  • Serial executable
    argv[1] = 'Path to the .mtx file of the praph'
    argv[2] = number of threads to request from the OpenCilk API

Note: Relative file paths are not supported on Windows.

Graphs

The input of the program is a graph.mtx file. This is a matrix market file that contains the relations between the nodes oth the graphs. IMPORTANT note: All the graphs must be unweighted and undirected. This is crusial because only then the Sparse matrix of the graph is symmetrical. The code provided in the libraries directory takes this assumption for granted and does not check if it is true. During the development phase the graphs bellow were used to test the code, along with some small graphs created for debugging purposes. All the files can be found in the Graphs directory.


The proceeding table showcases the graphs and creates a reference performance. (The times displayed were given with the assignment)

Name Excecution time
DIMACS10: Belgiumosm 2.8s
SNAP: com-Youtube 6.5s
Mycielski: mycielskian13 0.36s
LAW 0.38s
DIMACS10: NACA0015 2.46s

Note: The excecution time for the above graphs is measured on a two core laptop.

Along with the graph.mtx for most of the graphs graph.csr files are provided. Those files were created for debbuging purposes and store the Sparse matrices ins CSR format together with information regarding the size of the matrix and the number of non zero elements. Fianly the text files in the directory store performance data for executions with all the implementations (Serial and Parallel) for every graph.

Useful links

About

The objective of this assignment is to calculate how many triangles there are in a graph using C language along with multithreading techniques for better performance

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published