api-evangelist/pyspark

Apache PySpark (pyspark)

Python API for Apache Spark, a unified analytics engine for large-scale data processing

URL: Visit APIs.json URL

Tags:

  • big data, distributed computing, data processing, machine learning, streaming, python

Timestamps

  • Created: 2024
  • Modified: 2024

APIs

PySpark Core API

Core Spark functionality including RDDs, SparkContext, and basic operations

Human URL: https://spark.apache.org/docs/latest/api/python/reference/pyspark.html

Tags:

  • rdd, spark-context, core

Properties

PySpark SQL

Structured data processing with DataFrame and SQL operations

Human URL: https://spark.apache.org/docs/latest/sql-programming-guide.html

Tags:

  • dataframe, sql, structured-data

Properties

PySpark Streaming

Real-time stream processing with the DStream API (a legacy engine; Structured Streaming is its newer replacement)

Human URL: https://spark.apache.org/docs/latest/streaming-programming-guide.html

Tags:

  • streaming, real-time, dstream

Properties

PySpark MLlib

Machine learning library with scalable algorithms

Human URL: https://spark.apache.org/docs/latest/ml-guide.html

Tags:

  • machine-learning, mllib, algorithms

Properties

PySpark ML (DataFrame-based)

DataFrame-based machine learning API

Human URL: https://spark.apache.org/docs/latest/ml-pipeline.html

Tags:

  • machine-learning, pipeline, dataframe

Properties

PySpark GraphX (GraphFrames)

Graph processing and analysis through GraphFrames, a separately distributed DataFrame-based package (GraphX itself exposes no Python API)

Human URL: https://graphframes.github.io/graphframes/docs/_site/index.html

Tags:

  • graph, graphframes, network-analysis

Properties

Common Properties

Maintainers

FN: Apache Software Foundation

Email: dev@spark.apache.org
