## About this Part

Congrats!
You have reached the last Part of this Sprint.
In this Part, you will put what you learned during this and the previous Sprints into practice.
As the final assignment of this Sprint, you will build a Python library for doing data transformations.

P.S. We don't expect this project to be perfect. You will continue to improve your skills, and there will be many future projects where you can apply them.
For now, just use what you have learned and try your best!

*Note:* [advice on building your portfolio](https://turingcollege.atlassian.net/wiki/spaces/DLG/pages/1002307695/Portfolio+Items)

## Context

You are working as a data engineer in a large software company.
You are building a proof of concept (POC) of a Python library that should help the data scientists in your company do data transformations.

### Functions

For the initial sprint, you have committed to building three functions that are commonly used to transform data for machine learning models.

#### Transpose

Transpose switches the axes of a tensor.
It is an extremely common operation in data science workflows.
If you haven't encountered transpose before, NumPy has a great explanation: [numpy.transpose](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html).
Your task is to build a transpose function that works on matrices (2D tensors) and has this signature: `transpose2d(input_matrix: list[list[float]]) -> list`, where `input_matrix` is a list of lists of real numbers.

Implement this function using only Python and its standard library.
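To make the expected behavior concrete, here is a minimal sketch of one possible approach. It assumes the input is rectangular (every row has the same length); how to handle ragged or empty input is not specified above, so treat that as a design decision of your own.

```python
def transpose2d(input_matrix: list[list[float]]) -> list:
    """Return the transpose of a rectangular 2D list (rows become columns)."""
    # zip(*rows) groups the i-th element of every row into one tuple.
    # Assumption: all rows have equal length; zip silently truncates ragged rows.
    return [list(column) for column in zip(*input_matrix)]


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(transpose2d(matrix))  # [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
```

A 2x3 matrix becomes a 3x2 matrix: element `[i][j]` of the input ends up at `[j][i]` of the output.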

#### Time Series Windowing

Time series windows are important for a lot of time series analysis and modeling tasks.
You can read about them [here](https://www.mathworks.com/help/econ/rolling-window-estimation-of-state-space-models.html).
Your task is to build a function that has this signature: `window1d(input_array: list | np.ndarray, size: int, shift: int = 1, stride: int = 1) -> list[list | np.ndarray]`, where `input_array` is a list or 1D NumPy array of real numbers, `size` is a positive integer that determines the size (length) of the window, `shift` is a positive integer that determines the shift (step size) between the starts of consecutive windows, and `stride` is a positive integer that determines the stride (step size) between elements within each window. Your function should return a list of lists or a list of 1D NumPy arrays of real numbers.
If you need help understanding the parameters of this function, [tf.data.Dataset.window](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#window) might help.

Implement this function using only Python, its standard library, and NumPy.
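The interaction between `size`, `shift`, and `stride` is easier to see in code. The sketch below is one possible reading, not a reference solution: it assumes that only complete windows are returned (a trailing partial window is dropped), which the description above does not specify. The name `span` is introduced here for illustration only.

```python
from __future__ import annotations

import numpy as np


def window1d(
    input_array: list | np.ndarray, size: int, shift: int = 1, stride: int = 1
) -> list[list | np.ndarray]:
    """Split a 1D sequence into windows of `size` elements."""
    # Number of input positions a single window covers, including skipped ones.
    span = (size - 1) * stride + 1
    windows = []
    # Assumption: only full windows are kept, so the last start index is
    # len(input_array) - span.
    for start in range(0, len(input_array) - span + 1, shift):
        # Slicing works for both lists and NumPy arrays; for arrays it
        # returns a view, not a copy.
        windows.append(input_array[start:start + span:stride])
    return windows


data = [0, 1, 2, 3, 4, 5, 6]
print(window1d(data, size=3, shift=2))   # [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
print(window1d(data, size=3, stride=2))  # [[0, 2, 4], [1, 3, 5], [2, 4, 6]]
```

With `shift=2` each window starts two positions after the previous one; with `stride=2` each window takes every second element.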

#### Cross-Correlation

Convolutional neural networks are very popular in deep learning.
Surprisingly, they rely on the cross-correlation function rather than the convolution function.
The two functions are similar enough that most texts refer to both as convolutions.
You can read about the cross-correlation and convolution functions [here](https://d2l.ai/chapter_convolutional-neural-networks/conv-layer.html) and [here](https://www.deeplearningbook.org/contents/convnets.html).
Your task is to build a function that has this signature: `convolution2d(input_matrix: np.ndarray, kernel: np.ndarray, stride: int = 1) -> np.ndarray`, where `input_matrix` is a 2D NumPy array of real numbers, `kernel` is a 2D NumPy array of real numbers, and `stride` is an integer that is greater than 0. Your function should return a 2D NumPy array of real numbers.
If you need help understanding the parameters of this function, [torch.nn.functional.conv2d](https://pytorch.org/docs/stable/generated/torch.nn.functional.conv2d.html#torch.nn.functional.conv2d) might help.

Implement this function using only Python, its standard library, and NumPy.
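As a rough illustration of the operation, the sketch below slides the kernel over the input and sums the element-wise products at each position. It is a deliberately simple loop-based version under two assumptions that the description above does not state: no padding is applied, and the kernel is no larger than the input in either dimension.

```python
import numpy as np


def convolution2d(
    input_matrix: np.ndarray, kernel: np.ndarray, stride: int = 1
) -> np.ndarray:
    """2D cross-correlation of `input_matrix` with `kernel` (no padding)."""
    kernel_h, kernel_w = kernel.shape
    # Output size for an unpadded sliding window with the given stride.
    out_h = (input_matrix.shape[0] - kernel_h) // stride + 1
    out_w = (input_matrix.shape[1] - kernel_w) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = input_matrix[top:top + kernel_h, left:left + kernel_w]
            # Cross-correlation: the kernel is NOT flipped before multiplying.
            output[i, j] = np.sum(patch * kernel)
    return output


x = np.arange(9.0).reshape(3, 3)
k = np.array([[0.0, 1.0], [2.0, 3.0]])
print(convolution2d(x, k))
# [[19. 25.]
#  [37. 43.]]
```

A true convolution would flip the kernel along both axes before sliding it; cross-correlation skips that step.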

### Library

The second part of the job is to publish these three functions as a package in [PyPI](https://pypi.org/).
Real Python has a [great article](https://realpython.com/pypi-publish-python-package/) about publishing your own packages.
Because your entire team uses Poetry for dependency management, you will also have to use it for this task.

## Objectives for this Part

- Practice using Python and NumPy.
- Practice building Python libraries.
- Practice using Poetry.
- Practice publishing your packages to PyPI.

## Requirements

- Implement the three data transformation functions described in the Context section.
- Build a Python library containing the three functions.
- Publish the library to PyPI.
- Provide suggestions for how your library could be improved.

## Evaluation Criteria

- Adherence to the requirements. How well did you meet the requirements?
- Code quality. Was your code well-structured? Did you use the appropriate levels of abstraction? Did you remove commented-out and unused code? Did you adhere to PEP 8?
- Code performance. Did you use suitable algorithms and data structures to solve the problems?
- Presentation quality. Coherence of the presentation of the project, and how well everything is explained.
- General understanding of the topic.

## Project Review

During your project review, you should present your project as if talking to a data scientist on your team who is building the machine learning model.
You can assume that they will have strong data science and decent software engineering skills - they will understand technical jargon but are not expected to notice things that could have been done better or ask about the choices you've made.
They are very familiar with the problem, so don't spend your time explaining trivial concepts or simple code snippets. Your best bet is to focus your presentation on technological and design choices as well as the end-user functionality of your solution.

During a project review, you may get asked questions that test your understanding of covered topics.

- What is REST? How is it related to web services?
- What is the difference between structured, semi-structured, and unstructured data? Which file formats are typically used for each type of data?
- What is goose typing in Python? How does it differ from duck typing and static duck typing?

IMPORTANT: during the project review, you will also be asked to solve an exercise using Python.


## General Project Review Guidelines

For an in-depth explanation about how project reviews work at Turing College, please read [this doc](https://turingcollege.atlassian.net/wiki/spaces/DLG/pages/537395951/Peer+expert+reviews+corrections).
