This project demonstrates how to build a REST API for Spark User Defined Functions (UDFs) using PySpark. The API lets clients invoke Spark UDFs through HTTP requests, so data processing and transformation logic running on Spark can be called from any HTTP client.
- REST API for Spark UDFs: Exposes Spark UDFs over HTTP.
- PySpark Integration: Runs UDFs on Spark through PySpark for scalable data processing.
- JSON-based Communication: Accepts and returns data in JSON format.
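
The JSON request/response flow described above can be sketched with Python's standard library alone. This is only an illustration, not the project's actual implementation: the `POST /udf/<name>` route, the `{"values": [...]}` payload shape, and the names `UDF_REGISTRY`, `to_upper`, `apply_udf`, and `serve` are all assumptions. The function is also called directly here, whereas the real project would apply it through Spark.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def to_upper(value):
    """Hypothetical example function; None passes through unchanged."""
    return value.upper() if value is not None else None


# Hypothetical registry mapping a UDF name to a plain Python callable.
UDF_REGISTRY = {"to_upper": to_upper}


def apply_udf(name, payload):
    """Apply the named function to payload["values"]; return (status, body)."""
    func = UDF_REGISTRY.get(name)
    if func is None:
        return 404, {"error": f"unknown UDF: {name}"}
    values = payload.get("values") if isinstance(payload, dict) else None
    if not isinstance(values, list):
        return 400, {"error": "'values' must be a list"}
    try:
        results = [func(v) for v in values]
    except Exception as exc:  # report bad input instead of crashing the server
        return 400, {"error": f"UDF failed: {exc}"}
    return 200, {"udf": name, "results": results}


class UdfHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumed route: POST /udf/<name>
        parts = self.path.strip("/").split("/")
        if len(parts) != 2 or parts[0] != "udf":
            status, body = 404, {"error": "not found"}
        else:
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
            except json.JSONDecodeError:
                status, body = 400, {"error": "invalid JSON"}
            else:
                status, body = apply_udf(parts[1], payload)
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host="127.0.0.1", port=8000):
    """Start the sketch server; blocks until interrupted."""
    HTTPServer((host, port), UdfHandler).serve_forever()
```

With `serve()` running, a client could send `{"values": ["spark", "udf"]}` to `http://127.0.0.1:8000/udf/to_upper`, for example with the Requests library. Keeping the dispatch logic in `apply_udf` separate from the HTTP handler makes it easy to test without a running server.
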
Before running the project, ensure you have the following installed:
- Python 3.7+
- Apache Spark (a version compatible with your PySpark installation)
- PySpark
- Requests (for testing the API)
git clone https://github.com/Hamdi1996/Spark-User-Defined-Function-REST-API.git
cd Spark-User-Defined-Function-REST-API

- UDF Definitions: Add or modify UDFs within the Jupyter notebook.
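
As a rough guide to what a UDF definition in the notebook might look like, here is a minimal sketch. The names `normalize_name` and `register_udfs` are hypothetical and do not come from this repository. The PySpark imports sit inside the helper so the plain function can be tested without a Spark installation.

```python
def normalize_name(value):
    """Collapse whitespace and title-case a string; None (a Spark null) passes through."""
    if value is None:
        return None
    return " ".join(value.split()).title()


def register_udfs(spark):
    """Register the function for Spark SQL and return a DataFrame-API UDF.

    `spark` is assumed to be an existing SparkSession.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    # Makes the function callable by name in Spark SQL queries.
    spark.udf.register("normalize_name", normalize_name, StringType())
    # Returns a wrapper usable in DataFrame expressions.
    return udf(normalize_name, StringType())
```

In a notebook with an active `SparkSession` named `spark`, `normalize = register_udfs(spark)` could then be used as `df.withColumn("name", normalize(df["name"]))`. Handling `None` explicitly matters because Spark passes null column values to Python UDFs as `None`.
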
Contributions are welcome! Feel free to open issues or submit pull requests to improve the project.
This project is licensed under the MIT License. See the LICENSE file for more details.