GSoC'24 Project Ideas List

Shahzaib Kashif edited this page Mar 16, 2024 · 13 revisions

Welcome to the project ideas list for GSoC'24. You may choose any of the projects listed here to work on with us, or you can suggest your own idea.

We encourage you to join our Gitter Chat Room (general) to discuss ideas. Please use the recommended Proposal Format for all projects and you can see the detailed timeline here.

Project 1: Quad-SPI Flash Controller for Azadi-SoC

Difficulty: Medium

Designation: 175 hours (medium)

Required Skills & Tools: SystemVerilog, SPI and QSPI protocol, Memory interfacing, Verilator.

Mentors: Sajjad Ahmed, Zeeshan Rafique

Chat Channel: Join Azadi-SoC Gitter Channel

Overview:

Flash memories are widely used as off-chip storage for embedded processors. For an embedded processor to interact with different flash memory devices, it needs a well-structured flash memory controller that can communicate with flash memories from different vendors. The project aims to design a Quad-SPI flash memory controller for Azadi SoC (a 32-bit SoC based on the RISC-V ISA). The controller should have flexible configuration support to target various flash devices, e.g., Micron, Spansion, and Winbond.

Block diagram:

image

Expected Outcomes:

  • Single and Quad SPI interface modes.
  • All four operating modes (00, 01, 10, 11) of CPOL and CPHA.
  • Flexible configuration support for targeting different flash memories.
  • Documentation of the work done
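As a quick orientation for the CPOL/CPHA and lane-width outcomes above, here is a minimal Python sketch of the standard SPI mode conventions. The function names `spi_mode` and `clocks_per_byte` are hypothetical helpers made up for illustration. They are not part of Azadi-SoC or of any flash vendor's API.

```python
def spi_mode(cpol, cpha):
    """Describe one of the four SPI modes from its CPOL and CPHA bits."""
    if cpol not in (0, 1) or cpha not in (0, 1):
        raise ValueError("CPOL and CPHA must each be 0 or 1")
    # CPOL sets the idle level of the clock, so the leading edge is
    # rising when CPOL=0 and falling when CPOL=1.
    leading = "rising" if cpol == 0 else "falling"
    trailing = "falling" if cpol == 0 else "rising"
    # CPHA=0 samples data on the leading edge, CPHA=1 on the trailing edge.
    sample_edge = leading if cpha == 0 else trailing
    return {"mode": 2 * cpol + cpha, "idle_level": cpol, "sample_edge": sample_edge}


def clocks_per_byte(lanes):
    """Clock cycles needed to move one byte over `lanes` data lines."""
    if lanes not in (1, 2, 4):
        raise ValueError("expected 1 (single), 2 (dual) or 4 (quad) lanes")
    return 8 // lanes


for cpol in (0, 1):
    for cpha in (0, 1):
        print(spi_mode(cpol, cpha))
print("single:", clocks_per_byte(1), "quad:", clocks_per_byte(4))
```

A software model like this can serve as a reference when checking the RTL in simulation. The actual controller would be written in SystemVerilog, as listed in the required skills.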

Project 2: OpenTCAM - SRAM-based TCAM compiler (phase-2)

Difficulty: Hard

Designation: 350 hours (large)

Required Skills & Tools: Python, Verilog, OpenRAM Compiler, Verilator.

Mentors: Ali Ahmed, Sajjad Ahmed

Chat Channel: Join OpenTCAM Gitter Channel

Overview:

OpenTCAM is an open-source Python framework that can be used to create the design (RTL) and layouts (GDS-II) of a customizable SRAM-based TCAM memory for use in FPGA and ASIC designs. Currently, the compiler uses SRAMs generated by the OpenRAM compiler, but the goal is a generalized compiler for any SRAM-based TCAM. The plan is to use the 36KB BRAM blocks of FPGAs, and OpenRAM-generated 1Kb SRAM blocks (using the SKY130 PDK) for ASICs, to emulate a TCAM of any size.

Classical TCAM is expensive, occupies a large area (16T cell), has high power consumption, and is not openly available. SRAM, by contrast, is cheap, power-efficient, requires less area (6T cell), and is mass-produced.

OpenTCAM RTL-Gen is an extension of the broader TCAM compiler developed in GSoC’22, which has a TCAM-to-SRAM table mapping facility but very limited RTL generation support. This project is all about adding a wide range of configurable RTL generation to the existing compiler.

Table Mapping:

image image
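To make the idea of TCAM-to-SRAM table mapping concrete, the following is a minimal Python sketch of one common way to emulate a TCAM with SRAM tables. It is not necessarily the exact scheme used by OpenTCAM. It assumes each rule is a ternary string over `0`, `1`, and `x` (don't care), that the rule width is a multiple of the sub-word width, and that each SRAM row stores one match bit per rule. The names `build_tables` and `search` are hypothetical.

```python
from itertools import product


def expand(chunk):
    """List every binary string covered by a ternary chunk such as '1x'."""
    options = ["01" if c in "xX" else c for c in chunk]
    return ["".join(bits) for bits in product(*options)]


def build_tables(rules, sub_width):
    """Map ternary rules onto one SRAM table per sub-word.

    Row `addr` of table `b` holds a bit vector: bit i is set when
    sub-word `b` of rule i matches the binary value `addr`.
    """
    width = len(rules[0])
    n_blocks = width // sub_width
    tables = [[0] * (2 ** sub_width) for _ in range(n_blocks)]
    for rule_idx, rule in enumerate(rules):
        for b in range(n_blocks):
            chunk = rule[b * sub_width:(b + 1) * sub_width]
            for addr_bits in expand(chunk):
                tables[b][int(addr_bits, 2)] |= 1 << rule_idx
    return tables


def search(tables, key, sub_width):
    """Read one row per table and AND the bit vectors together."""
    match = ~0  # all ones, narrowed by each table read
    for b, table in enumerate(tables):
        addr = int(key[b * sub_width:(b + 1) * sub_width], 2)
        match &= table[addr]
    return [i for i in range(match.bit_length()) if (match >> i) & 1]


rules = ["10x1", "0xx0"]
tables = build_tables(rules, sub_width=2)
print(search(tables, "1011", 2))  # indices of the rules matching this key
```

In this sketch a lookup costs one read per SRAM block plus a bitwise AND, which is the part that hardware can do in a single cycle or pipeline across several.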

Block Diagram:

image

Expected Outcomes:

  • Python-based TCAM compiler that can generate SystemVerilog RTL with different configurations of TCAM blocks for single-cycle search.
  • Support for pipelined TCAM structure for multicycle search operations.
  • Documentation of the work done

Reference: Ali Ahmed, Kyungbae Park. “Resource-Efficient SRAM-Based Ternary Content Addressable Memory”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems ( Volume: 25, Issue: 4, April 2017)

Project 3: ASIC Design of OpenTCAM Generated IPs using OpenLane and GF180/SKY130

Difficulty: Medium

Designation: 175 hours (medium)

Preferred Skills: Understanding of Digital VLSI Designs using OpenLane (Synthesis, Floorplan, Place and Route)

Tools: OpenLane, OpenTCAM

PDKs: GF180, SKY130

Mentors: Ali Ahmed, Sajjad Ahmed

Chat Channel: Join OpenTCAM Gitter Channel

Overview:

OpenTCAM is an open-source Python-based compiler that generates SRAM-based TCAM IPs using open-source SRAM blocks (OpenRAM). This project is about generating layouts (GDS) of IPs generated with OpenTCAM, optimized for power, performance, and area. The project focuses not only on using SRAM IPs for TCAM layout but also on experimenting with a wide range of TCAM configurations built on flip-flop-based memories, which would provide results for maximizing the utilization of SRAM for TCAM emulation.

Expected Outcomes:

  • Framework for generating optimized GDS of TCAM IPs using OpenLane.
  • GDS of various TCAM configurations discussed during the project.
  • Statistical results of various design iterations for comparative study of optimized TCAM configuration.
  • Documentation for the complete project.

Project 4: Baby Kyber Accelerator

Difficulty: Hard

Designation: 350 hours (large)

Required Skills: CHISEL HDL, Post-Quantum Cryptography (PQC), Algebra

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join Baby Kyber Accelerator Gitter Channel

Overview:

Baby Kyber, a simplified version of the Kyber encryption system, employs polynomial arithmetic with specific moduli for computational efficiency. Key generation involves creating private keys with small polynomials and public keys with random polynomials. Moreover, encryption transforms binary messages into polynomials, scales them, and encrypts them using the public key, while decryption reverses this process to recover the original message, leveraging the hardness of the module-learning-with-errors (MLWE) problem for security.
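The key generation, encryption, and decryption steps described above can be sketched in a few lines of Python. This is a minimal software model, assuming the toy parameters usually quoted for Baby Kyber (modulus q = 17 and polynomials reduced modulo x^4 + 1). The function names are hypothetical, and the small secret and error polynomials are passed in explicitly instead of being sampled, so the example stays deterministic. Polynomials are lists of coefficients, lowest degree first.

```python
Q = 17  # assumed toy modulus
N = 4   # polynomials live in Z_Q[x] / (x^N + 1)


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]


def poly_mul(a, b):
    """Multiply modulo x^N + 1: any x^N term wraps around as -1."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + ai * bj) % Q
            else:
                out[k - N] = (out[k - N] - ai * bj) % Q
    return out


def dot(u, v):
    """Inner product of two vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = poly_add(acc, poly_mul(a, b))
    return acc


def mat_vec(m, v):
    return [dot(row, v) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def keygen(a, s, e):
    """Public key (A, t = A*s + e); private key s."""
    t = [poly_add(p, ei) for p, ei in zip(mat_vec(a, s), e)]
    return (a, t), s


def encrypt(pk, bits, r, e1, e2):
    """Scale message bits to about q/2 and hide them behind noise."""
    a, t = pk
    scaled = [b * ((Q + 1) // 2) for b in bits]
    u = [poly_add(p, e) for p, e in zip(mat_vec(transpose(a), r), e1)]
    v = poly_add(poly_add(dot(t, r), e2), scaled)
    return u, v


def decrypt(s, ct):
    """Remove the mask, then round each coefficient to 0 or about q/2."""
    u, v = ct
    noisy = poly_sub(v, dot(s, u))
    half = (Q + 1) // 2
    return [1 if abs(c - half) < min(c, Q - c) else 0 for c in noisy]


# Arbitrary public matrix and deliberately tiny secret/error values.
A = [[[11, 16, 16, 6], [3, 6, 4, 9]], [[1, 10, 3, 5], [15, 9, 1, 6]]]
s = [[1, 0, 0, 0], [0, 1, 0, 0]]
e = [[1, 0, 0, 0], [0, 0, 0, 0]]
pk, sk = keygen(A, s, e)
ct = encrypt(pk, [1, 1, 0, 1], r=[[0, 1, 0, 0], [1, 0, 0, 0]],
             e1=[[0, 0, 0, 0], [1, 0, 0, 0]], e2=[0, 0, 1, 0])
print(decrypt(sk, ct))
```

Decryption only recovers the message when the accumulated noise stays below roughly q/4 per coefficient, which is why the example uses very sparse error terms. A hardware accelerator would mostly be concerned with the polynomial multiply-accumulate in `poly_mul` and `dot`.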

The Baby Kyber Accelerator project aims to develop a dedicated hardware accelerator that speeds up Baby Kyber encryption operations at the hardware level. It will efficiently implement the Baby Kyber key exchange algorithm while integrating random number generation. Additionally, it will be integrated with an existing matrix multiplication accelerator to enhance computational power. This initiative seeks to produce a robust cryptographic solution that can be attached to and work within the SoC-Now Framework and Azadi SoC, prioritizing security and efficiency.

Expected Outcomes:

  • A dedicated CHISEL-based hardware accelerator with Generic Interfaces, specifically tailored for the Baby Kyber key exchange algorithm.
  • Working prototype attached to SoC-Now and Azadi SoC.
  • Complete documentation of the project.

Project 5: Spike ISS Backend for Oxygen Simulator

Difficulty: Medium

Designation: 175 hours (medium)

Required Skills: RISC-V ISA, Linux, Spike ISS, Python, Django

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join Oxygen Simulator Gitter Channel

Overview:

The project aims to introduce a Spike ISS-based backend option for the Oxygen RISC-V ISS Simulator. Currently, Oxygen provides a versatile environment for simulating RISC-V programs, similar to other simulators like Venus. However, integrating Spike ISS as a backend will enable users to simulate their programs on Oxygen and observe real-time register values and memory contents generated by the Spike ISS.
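One possible shape for such a backend is to run Spike with commit logging enabled and replay the log into a register-file view for the front end. The sketch below covers only the parsing side, in Python. The log line layout used here is an assumption made for illustration and should be checked against the output of the Spike version in use. `parse_commit` and `replay` are hypothetical names, not part of Oxygen or Spike.

```python
import re

# Assumed commit-log line shape (verify against your Spike build):
#   core   0: 3 0x0000000080000004 (0x02028593) x11 0x0000000080000028
COMMIT_RE = re.compile(
    r"core\s+(?P<hart>\d+):\s+(?P<priv>\d)\s+0x(?P<pc>[0-9a-fA-F]+)\s+"
    r"\(0x(?P<insn>[0-9a-fA-F]+)\)"
    r"(?:\s+x(?P<rd>\d+)\s+0x(?P<value>[0-9a-fA-F]+))?"
)


def parse_commit(line):
    """Turn one log line into a dict, or None if it does not match."""
    m = COMMIT_RE.match(line.strip())
    if m is None:
        return None
    record = {
        "hart": int(m["hart"]),
        "pc": int(m["pc"], 16),
        "insn": int(m["insn"], 16),
        "rd": None,
        "value": None,
    }
    if m["rd"] is not None:
        record["rd"] = int(m["rd"])
        record["value"] = int(m["value"], 16)
    return record


def replay(lines):
    """Rebuild the integer register file by applying each write in order."""
    regs = [0] * 32
    for line in lines:
        record = parse_commit(line)
        # Skip unparsed lines, instructions without a register write, and x0.
        if record and record["rd"]:
            regs[record["rd"]] = record["value"]
    return regs


log = [
    "core   0: 3 0x0000000080000000 (0x00000297) x5  0x0000000080000000",
    "core   0: 3 0x0000000080000004 (0x02028593) x11 0x0000000080000028",
]
regs = replay(log)
print(hex(regs[5]), hex(regs[11]))
```

Keeping the parser separate from the process handling makes it easy to unit test without having Spike installed.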

Expected Outcomes:

  • Develop a separate Python backend that integrates Spike ISS with the Oxygen simulator.
  • Implement a user-friendly interface in Oxygen to visualize real-time register values and memory contents generated by Spike ISS during simulation.
  • Provide features for easy debugging and analysis of programs running on the Spike ISS backend.
  • Ensure compatibility with various RISC-V extensions supported by Spike ISS.
  • Complete documentation of the project.

Project 6: Vaquita Vector Coprocessor

Difficulty: Hard

Designation: 350 hours (large)

Required Skills: RISC-V ISA, Chisel, Computer Architecture, Vector Processing, RISC-V Vector Extension Fundamentals

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join NucleusRV Gitter Channel

Overview:

This project aims to expand the capabilities of the NucleusRV core by attaching the RISC-V Vector Extension (RVV v1.0) in the form of a coprocessor. This initiative will concentrate on the early-stage development of the basic functionality of the RISC-V vector extension, specifically targeting the implementation of:

  1. Configuration Setting Instructions.
  2. Vector Arithmetic Instructions.
  3. Vector Unit Stride Load/Store Operations with an emphasis on lmul=1 and lmul > 1.
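The interaction between configuration setting, LMUL, and unit-stride accesses can be illustrated with a small Python model. It is a sketch under stated assumptions: LMUL is an integer (1, 2, 4, or 8), VLEN and SEW are given in bits, and `vl` is chosen as `min(AVL, VLMAX)`, which is one of the choices the RVV specification permits. All function names are hypothetical and unrelated to the NucleusRV or Vaquita code base.

```python
def vlmax(vlen_bits, sew_bits, lmul):
    """Maximum number of elements in one register group."""
    return lmul * vlen_bits // sew_bits


def set_vl(avl, vlen_bits, sew_bits, lmul):
    """Vector length granted for a requested application vector length."""
    return min(avl, vlmax(vlen_bits, sew_bits, lmul))


def element_location(vd, index, vlen_bits, sew_bits):
    """Register and slot holding element `index` of the group starting at vd."""
    per_register = vlen_bits // sew_bits
    return vd + index // per_register, index % per_register


def unit_stride_addresses(base, vl, sew_bits):
    """Byte addresses touched by a unit-stride load or store of vl elements."""
    return [base + i * (sew_bits // 8) for i in range(vl)]


VLEN, SEW = 128, 32
for lmul in (1, 2):
    vl = set_vl(10, VLEN, SEW, lmul)
    print("lmul", lmul, "vl", vl, "last element in", element_location(8, vl - 1, VLEN, SEW))
print(unit_stride_addresses(0x1000, 4, SEW))
```

With LMUL greater than 1, a single instruction operates on a group of consecutive registers, so the load/store unit and the arithmetic datapath both need to iterate over the registers in the group.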

The selected candidate will be expected to collaborate closely with the project maintainers to finalize the coprocessor's design and architecture, followed by the development and integration of RTL designs tailored to these specifications. The integration of the Vaquita Coprocessor with NucleusRV is expected to provide seamless operation within the NucleusRV ecosystem and improve performance for vector-intensive computations.

Expected Outcomes:

  • Develop an architectural plan for the vector implementation in a coprocessor, focusing on initial support for essential vector operations.
  • RTL implementation of the coprocessor with NucleusRV-compatible attachable interfaces, specifically addressing configuration settings, vector arithmetic, and unit-stride load/store instructions supporting lmul=1 and lmul > 1.
  • Complete documentation of the project.

Project 7: Automated Hardware Design and Compiler Generation using MLIR and CIRCT

Difficulty: Hard

Designation: 350 hours (large)

Required Skills: Compiler design, LLVM (MLIR), CIRCT

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join Merledu LLVM MLIR Gitter Channel

Overview:

Every newly designed piece of hardware requires its own compiler. Creating a compiler is a time-intensive task on top of hardware design. Compiler technologies have made great strides with the advent of MLIR and CIRCT. The goal of this project is to automate hardware design along with building its compiler by targeting the multi-level compilation capabilities of MLIR (Multi-Level Intermediate Representation) and CIRCT.

Expected Outcomes:

  • Compiler generation using MLIR and CIRCT
  • Documentation and demonstration of the automated workflow

Project 8: Attach Magma-Si with SoC-Now

Difficulty: Medium

Designation: 175 hours (medium)

Required Skills: CHISEL HDL, Computer Architecture, General Matrix Multiplication (GeMM), Accelerators, System-on-Chip

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join Magma-Si Gitter Channel

Overview:

Magma-Si (Matrix Accelerator Generator for GeMM Operations based on SIGMA Architecture) is a GeMM accelerator generator designed in CHISEL HDL. It is based on the SIGMA architecture, which makes Magma-Si generated GeMM accelerators comparatively faster than traditional systolic-array-based GeMM accelerators.

This project aims to attach Magma-Si generated accelerators to the SoC-Now Framework, so that SoCs generated with SoC-Now can multiply matrices more efficiently via Magma-Si generated GeMM accelerators.

Expected Outcomes:

  • SoC-Now compatible generic interfaces working with Magma-Si.
  • An SoC generated from SoC-Now with a Magma-Si accelerator attached to it.
  • The SoC with the GeMM accelerator shall be emulated successfully on an FPGA.
  • Complete documentation of the project.

Project 9: Dynamic Implementation of Magma-Si Components

Difficulty: Hard

Designation: 350 hours (large)

Required Skills: CHISEL HDL, Computer Architecture, General Matrix Multiplication (GeMM), Accelerators, System-on-Chip, SIGMA Architecture

Mentors: Farhan Ahmed Karim, Shahzaib Kashif

Chat Channel: Join Magma-Si Gitter Channel

Overview:

Magma-Si (Matrix Accelerator Generator for GeMM Operations based on SIGMA Architecture) is a GeMM accelerator generator designed in CHISEL HDL. It is based on the SIGMA architecture, which makes Magma-Si generated GeMM accelerators comparatively faster than traditional systolic-array-based GeMM accelerators.

This project aims to implement the components of Magma-Si, such as the Forward Adder Network (FAN) and the Benes Distribution Network, so that it can generate accelerators supporting a wider range of matrix sizes in terms of the number of rows and columns. Currently, Magma-Si supports accelerator generation for matrices with at most 2 rows and 2 columns. To raise this limit, the networks need to be made more dynamic, that is, parameterized by size.
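To give a feel for how the distribution network grows with its parameters, here is a minimal Python sketch based on the textbook Benes topology: a network with N = 2^k ports has 2k - 1 stages of N/2 two-by-two switches. How matrix rows and columns map to the port count in Magma-Si is not specified here, so `n_ports` is an assumed parameter and the function names are hypothetical.

```python
def benes_dimensions(n_ports):
    """Return (stages, switches_per_stage) for a Benes network.

    n_ports must be a power of two and at least 2.
    """
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("n_ports must be a power of two >= 2")
    k = n_ports.bit_length() - 1  # log2(n_ports)
    return 2 * k - 1, n_ports // 2


def benes_switch_count(n_ports):
    """Total number of 2x2 switches in the network."""
    stages, per_stage = benes_dimensions(n_ports)
    return stages * per_stage


for ports in (2, 8, 1024):
    print(ports, benes_dimensions(ports), benes_switch_count(ports))
```

A generator that derives its structure from a formula like this, instead of hard-coding the 2x2 case, is the kind of parameterization the project asks for. The hardware itself would be expressed in CHISEL.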

The applicant shall study the SIGMA architecture in depth to better understand the underlying algorithm behind the Benes and FAN networks, so that they can be implemented in a dynamic, parameterized way.

Expected Outcomes:

  • Accelerators generated with a maximum of 1024 rows and 1024 columns.
  • Benes and FAN made dynamic, so that for any given parameters (number of rows and columns) each network reuses its components and generates the correct hardware following the algorithm.
  • Verification testbenches (randomized via Cocotb)
  • Complete documentation of the project.
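A randomized testbench needs a golden model to compare the hardware result against. The following is a minimal, framework-independent Python sketch of such a model. The names `random_matrix` and `gemm_reference`, as well as the operand range, are assumptions for illustration, and wiring the model into Cocotb is left out on purpose.

```python
import random


def random_matrix(rows, cols, rng, lo=-8, hi=7):
    """Matrix of random integers in [lo, hi], as a list of rows."""
    return [[rng.randint(lo, hi) for _ in range(cols)] for _ in range(rows)]


def gemm_reference(a, b):
    """Plain triple-loop matrix multiply used as the expected result."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


rng = random.Random(2024)  # fixed seed so a failing case can be reproduced
a = random_matrix(2, 3, rng)
b = random_matrix(3, 2, rng)
print(gemm_reference(a, b))
```

Seeding the generator explicitly keeps each random test reproducible, which helps when a mismatch between the model and the generated accelerator has to be debugged.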