Introduction

A Spark data source for retrieving DNS A records from a DNS server.
The Spark DNS data source uses zone transfers to retrieve data from the DNS server.
It requests IXFR for every zone transfer, though some DNS server implementations may return an AXFR response instead.

The Spark DNS data source may operate on multiple DNS zones in a single DataFrame.
Due to the nature of DNS zone transfers, data retrieval for a single zone cannot be parallelized,
though data from multiple zones is retrieved in parallel (each DNS zone is handled in a separate Spark partition of the RDD).
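
The parallelism model described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the data source's implementation: `FAKE_ZONES`, `transfer_zone` and `read_zones` are hypothetical names, a thread pool stands in for Spark partitions, and the records are made-up documentation addresses.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a DNS server: zone name -> ordered A records.
FAKE_ZONES = {
    "example.com.": [("www.example.com.", "192.0.2.10"),
                     ("api.example.com.", "192.0.2.11")],
    "example.org.": [("www.example.org.", "198.51.100.5")],
}

def transfer_zone(zone):
    """Read all A records of one zone sequentially, inside a single task."""
    return [(zone, name, ip) for name, ip in FAKE_ZONES[zone]]

def read_zones(zones):
    """Run one task per zone, mirroring one Spark partition per DNS zone."""
    with ThreadPoolExecutor(max_workers=max(1, len(zones))) as pool:
        per_zone = list(pool.map(transfer_zone, zones))  # keeps input order
    # Flatten the per-zone results into one list of rows.
    return [row for rows in per_zone for row in rows]

rows = read_zones(["example.com.", "example.org."])
```

Each zone is read by exactly one task, so adding more workers than zones would not speed up the read.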

Rationale

- Learning Spark internals
- Integrating Spark with third-party data sources
- Just for fun

Features and limitations

Limitations

- Providing multiple DNS servers in options for the same dataset/table is currently not supported.
- Continuous Structured Streaming is not supported yet.
- On Spark 2.4 (including CDH 6.3.x), only batch reading is supported.

Currently implemented features

- Spark batch read
- Retrieving DNS A records from multiple DNS zones (though from a single DNS server)
- The new SOA serial of a DNS zone is available in an Accumulator via the Spark UI (refer to the relevant stage)
- Spark Structured Streaming read support (only the Once and ProcessingTime triggers are supported)
- Zone transfer timeout
- Specifying an explicit zone transfer type (AXFR/IXFR) to use when retrieving data from the DNS server
  - When using xfr=ixfr, only DNS zone updates since the initial serial are returned.
    - In Structured Streaming, this may produce empty DataFrames when there are no updates.
  - When using xfr=axfr, all A records of the DNS zone are returned.
- Handling temporary failures during zone transfer (similar to failOnDataLoss in Spark+Kafka)
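
The difference between the two transfer types listed above can be illustrated with a toy model. This is a hedged sketch, not the data source's code: `HISTORY`, `axfr` and `ixfr` are hypothetical names, the zone history is invented, and how the data source itself represents incremental updates is not specified here.

```python
# Hypothetical zone history: SOA serial -> (added A records, removed A records).
HISTORY = {
    100: ({("www.example.com.", "192.0.2.10")}, set()),
    101: ({("api.example.com.", "192.0.2.11")}, set()),
    102: ({("www.example.com.", "192.0.2.20")},
          {("www.example.com.", "192.0.2.10")}),
}

def axfr():
    """Full transfer: return the newest serial and the entire current zone."""
    records = set()
    for serial in sorted(HISTORY):
        added, removed = HISTORY[serial]
        records = (records - removed) | added
    return max(HISTORY), records

def ixfr(initial_serial):
    """Incremental transfer: return only changes made after initial_serial.

    Real SOA serials wrap around (RFC 1982 serial arithmetic); this toy
    model compares them as plain integers for simplicity.
    """
    changes = [(s, HISTORY[s]) for s in sorted(HISTORY) if s > initial_serial]
    return max(HISTORY), changes

new_serial, full_zone = axfr()
_, updates = ixfr(100)       # two change sets: serials 101 and 102
_, no_updates = ixfr(102)    # already up to date: nothing to return
```

The last call shows why an incremental read can yield an empty result: when the initial serial already matches the zone's current serial, there are no updates to return.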

Releases: yurkao/spark-dns

Spark-DNS 1.0.3

Spark-DNS 1.0.2

Features added

Support for ignoring XFR failures
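
As a rough illustration of what ignoring transfer failures means, the sketch below returns no rows for a failed zone instead of failing the whole read. Everything here is an assumption made for the example: `TransferError`, `read_zone` and the `ignore_failures` flag are hypothetical names, not the data source's actual option or API.

```python
class TransferError(Exception):
    """Hypothetical error raised when a zone transfer fails."""

def read_zone(zone, transfer, ignore_failures=False):
    """Return the records of one zone, optionally tolerating a failed transfer."""
    try:
        return list(transfer(zone))
    except TransferError:
        if ignore_failures:
            return []  # treat the failed zone as having produced no rows
        raise          # otherwise surface the failure to the caller

def failing_transfer(zone):
    """Hypothetical transfer function that always fails."""
    raise TransferError(f"transfer of {zone} failed")

rows = read_zone("example.com.", failing_transfer, ignore_failures=True)
```

With `ignore_failures=False` the same call raises `TransferError`, which is the stricter behaviour analogous to failing on data loss.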
