Commit 78b7cf3

Author: TheovanKraay
Parent: e84bc25

File tree: 3 files changed (+8, -16 lines)

articles/cosmos-db/cassandra-spark-ddl-ops.md (4 additions, 5 deletions)

````diff
@@ -1,13 +1,13 @@
 ---
 title: DDL operations in Azure Cosmos DB Cassandra API from Spark
 description: This article details keyspace and table DDL operations against Azure Cosmos DB Cassandra API from Spark.
-author: kanshiG
-ms.author: govindk
+author: TheovanKraay
+ms.author: thvankra
 ms.reviewer: sngun
 ms.service: cosmos-db
 ms.subservice: cosmosdb-cassandra
 ms.topic: how-to
-ms.date: 09/24/2018
+ms.date: 10/07/2020

 ---

@@ -88,8 +88,7 @@ DESCRIBE keyspaces;
 ### Create a table

 ```scala
-val cdbConnector = CassandraConnector(sc)
-cdbConnector.withSessionDo(session => session.execute("CREATE TABLE IF NOT EXISTS books_ks.books(book_id TEXT PRIMARY KEY,book_author TEXT, book_name TEXT,book_pub_year INT,book_price FLOAT) WITH cosmosdb_provisioned_throughput=4000 , WITH default_time_to_live=630720000;"))
+cdbConnector.withSessionDo(session => session.execute("CREATE TABLE IF NOT EXISTS books_ks1.books(book_id TEXT,book_author TEXT, book_name TEXT,book_pub_year INT,book_price FLOAT, PRIMARY KEY(book_id,book_pub_year)) WITH cosmosdb_provisioned_throughput=4000 , WITH default_time_to_live=630720000;"))
 ```

 #### Validate in cqlsh
````
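Note that the revised snippet still references `cdbConnector`, whose definition this hunk removes; presumably it is defined earlier in the article. A minimal sketch of that setup, assuming the DataStax spark-cassandra-connector is on the classpath and `sc` is a SparkContext configured for the Cosmos DB Cassandra API endpoint (this code requires that environment and is not runnable standalone):

```scala
// Sketch only: assumes a SparkContext `sc` whose Cassandra connection
// settings point at the Cosmos DB Cassandra API endpoint.
import com.datastax.spark.connector.cql.CassandraConnector

// Build a connector from the SparkContext's Cassandra connection settings;
// cdbConnector.withSessionDo(...) can then execute the CREATE TABLE above.
val cdbConnector = CassandraConnector(sc)
```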

articles/cosmos-db/cassandra-spark-read-ops.md (4 additions, 11 deletions)

````diff
@@ -81,17 +81,10 @@ readBooksDF.show
 You can push down predicates to the database to allow for better optimized Spark queries. A predicate is a condition on a query that returns true or false, typically located in the WHERE clause. A predicate push down filters the data in the database query, reducing the number of entries retrieved from the database and improving query performance. By default the Spark Dataset API will automatically push down valid WHERE clauses to the database.

 ```scala
-val readBooksDF = spark
-  .read
-  .format("org.apache.spark.sql.cassandra")
-  .options(Map( "table" -> "books", "keyspace" -> "books_ks"))
-  .load
-  .select("book_name","book_author", "book_pub_year")
-  .filter("book_pub_year > 1891")
-//.filter("book_name IN ('A sign of four','A study in scarlet')")
-//.filter("book_name='A sign of four' OR book_name='A study in scarlet'")
-//.filter("book_author='Arthur Conan Doyle' AND book_pub_year=1890")
-//.filter("book_pub_year=1903")
+val df = spark.read.cassandraFormat("books", "books_ks").load
+df.explain
+val dfWithPushdown = df.filter(df("book_pub_year") > 1891)
+dfWithPushdown.explain

 readBooksDF.printSchema
 readBooksDF.explain
````
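The predicate-pushdown snippet this commit introduces can be read end to end as the following sketch. Assumptions (not stated in the diff): a SparkSession `spark` with the DataStax spark-cassandra-connector on the classpath and connection settings for the Cosmos DB Cassandra API endpoint already configured, and the `books_ks.books` table populated; the code requires that environment and is not runnable standalone.

```scala
// Sketch only: requires a Spark session configured for the
// Cosmos DB Cassandra API endpoint.
import org.apache.spark.sql.cassandra._  // adds cassandraFormat to DataFrameReader

// Define the read; no data is fetched yet (lazy evaluation).
val df = spark.read.cassandraFormat("books", "books_ks").load

// Physical plan before any filter: a scan of the whole table.
df.explain

// The comparison is a valid predicate, so the connector pushes it down:
// only rows with book_pub_year > 1891 are retrieved from the database.
val dfWithPushdown = df.filter(df("book_pub_year") > 1891)

// The plan now lists the predicate under PushedFilters.
dfWithPushdown.explain
```

Comparing the two `explain` outputs is how the article demonstrates that the filter runs in the database rather than in Spark.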
(Third changed file: binary content, 398 KB; preview not shown.)
0 commit comments
