From ad1719248a05024fd9820d8c7ba82630829f2bd8 Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy"
Date: Thu, 21 Jul 2022 08:41:16 -0700
Subject: [PATCH 1/4] Blog post highlighting the recordings of our Data and AI Summit talks

---
 _posts/2022-07-21-data-ai-summit-videos.md | 47 ++++++++++++++++++++++
 1 file changed, 47 insertions(+)
 create mode 100644 _posts/2022-07-21-data-ai-summit-videos.md

diff --git a/_posts/2022-07-21-data-ai-summit-videos.md b/_posts/2022-07-21-data-ai-summit-videos.md
new file mode 100644
index 0000000..9755ba5
--- /dev/null
+++ b/_posts/2022-07-21-data-ai-summit-videos.md
@@ -0,0 +1,47 @@
+---
+layout: post
+title: "Data and AI Summit Wrap-up"
+team: Core Platform
+author: rtyler
+tags:
+- databricks
+- kafka
+- deltalake
+- featured
+---
+
+We brought a whole team to San Francisco to present and attend this year's Data and
+AI Summit, and it was a blast!
+I
+would consider the event a success both in the attendance to the Scribd hosted
+talks and the number of talks which discussed patterns we have adopted in our
+own data and ML platform.
+The three talks I [wrote about
+previously](/blog/2022/data-ai-summit-2022.html) were well received and have
+since been posted to YouTube along with _hundreds_ of other talks.
+
+* [Christian Williams](https://github.com/xianwill) shared some of the
+work he has done developing
+[kafka-delta-ingest](https://github.com/scribd/kafka-delta-ingest) in his talk:
+**[Streaming Data into Delta Lake with Rust and
+Kafka](https://www.youtube.com/watch?v=do4jsxeKfd4&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=195)
+* [QP Hou](https://github.com/houqp), Scribd Emeritus, presented on
+his foundational work to ensure correctness within delta-rs during his session:
+**[Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal
+Verification](https://www.youtube.com/watch?v=ABoCnrVWCKY&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=112)
+* [R Tyler Croy](https://github.com/rtyler) co-presented with Gavin
+Edgley from Databricks on the cost analysis work Scribd has done to efficiently
+grow our data platform with **[Doubling the size of the data lake without doubling the
+cost](
+https://www.youtube.com/watch?v=9QDRD0PzqCE&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=122)
+
+Members of the Scribd team participated in a panel to discuss the past,
+present, and future of Delta Lake on the expo floor. We also took advantage of
+the time to have multiple discussions with our colleagues at Databricks about
+their product and engineering roadmap, and where we can work together to
+improve the future of Delta Lake, Unity catalog, and more.
+
+For those working in the data, ML, or infrastructure space, there are a lot of
+_great_ talks available online from the event, which I highly recommend
+checking out. Data and AI Summit is a great event for leaders in the industry
+to get together, so we'll definitely be back next year!

From a966efd7bb0382d2680cd8c2429db72e24d1f1a6 Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy"
Date: Fri, 22 Jul 2022 10:08:28 -0700
Subject: [PATCH 2/4] Update _posts/2022-07-21-data-ai-summit-videos.md

Co-authored-by: Jim Park
---
 _posts/2022-07-21-data-ai-summit-videos.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/_posts/2022-07-21-data-ai-summit-videos.md b/_posts/2022-07-21-data-ai-summit-videos.md
index 9755ba5..bbceb62 100644
--- a/_posts/2022-07-21-data-ai-summit-videos.md
+++ b/_posts/2022-07-21-data-ai-summit-videos.md
@@ -23,8 +23,7 @@ since been posted to YouTube along with _hundreds_ of other talks.
 * [Christian Williams](https://github.com/xianwill) shared some of the
 work he has done developing
 [kafka-delta-ingest](https://github.com/scribd/kafka-delta-ingest) in his talk:
-**[Streaming Data into Delta Lake with Rust and
-Kafka](https://www.youtube.com/watch?v=do4jsxeKfd4&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=195)
+[![Streaming Data into Delta Lake with Rust and Kafka](https://img.youtube.com/vi/do4jsxeKfd4/hqdefault.jpg)](https://www.youtube.com/watch?v=do4jsxeKfd4&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=195)
 * [QP Hou](https://github.com/houqp), Scribd Emeritus, presented on
 his foundational work to ensure correctness within delta-rs during his session:
 **[Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal

From 224c8fac751e5a65ab5100f387ef13e2e63aab88 Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy"
Date: Fri, 22 Jul 2022 10:08:32 -0700
Subject: [PATCH 3/4] Update _posts/2022-07-21-data-ai-summit-videos.md

Co-authored-by: Jim Park
---
 _posts/2022-07-21-data-ai-summit-videos.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_posts/2022-07-21-data-ai-summit-videos.md b/_posts/2022-07-21-data-ai-summit-videos.md
index bbceb62..a38b4aa 100644
--- a/_posts/2022-07-21-data-ai-summit-videos.md
+++ b/_posts/2022-07-21-data-ai-summit-videos.md
@@ -26,8 +26,8 @@ work he has done developing
 [![Streaming Data into Delta Lake with Rust and Kafka](https://img.youtube.com/vi/do4jsxeKfd4/hqdefault.jpg)](https://www.youtube.com/watch?v=do4jsxeKfd4&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=195)
 * [QP Hou](https://github.com/houqp), Scribd Emeritus, presented on
 his foundational work to ensure correctness within delta-rs during his session:
-**[Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal
-Verification](https://www.youtube.com/watch?v=ABoCnrVWCKY&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=112)
+[![Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal
+Verification](https://img.youtube.com/vi/ABoCnrVWCKY/hqdefault.jpg)](https://www.youtube.com/watch?v=ABoCnrVWCKY&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=112)
 * [R Tyler Croy](https://github.com/rtyler) co-presented with Gavin
 Edgley from Databricks on the cost analysis work Scribd has done to efficiently
 grow our data platform with **[Doubling the size of the data lake without doubling the

From b0b4fc8c73d3a1433d63926694f74903239f7b5d Mon Sep 17 00:00:00 2001
From: "R. Tyler Croy"
Date: Fri, 22 Jul 2022 10:08:37 -0700
Subject: [PATCH 4/4] Update _posts/2022-07-21-data-ai-summit-videos.md

Co-authored-by: Jim Park
---
 _posts/2022-07-21-data-ai-summit-videos.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/_posts/2022-07-21-data-ai-summit-videos.md b/_posts/2022-07-21-data-ai-summit-videos.md
index a38b4aa..828f149 100644
--- a/_posts/2022-07-21-data-ai-summit-videos.md
+++ b/_posts/2022-07-21-data-ai-summit-videos.md
@@ -30,9 +30,8 @@ his foundational work to ensure correctness within delta-rs during his session:
 Verification](https://img.youtube.com/vi/ABoCnrVWCKY/hqdefault.jpg)](https://www.youtube.com/watch?v=ABoCnrVWCKY&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=112)
 * [R Tyler Croy](https://github.com/rtyler) co-presented with Gavin
 Edgley from Databricks on the cost analysis work Scribd has done to efficiently
-grow our data platform with **[Doubling the size of the data lake without doubling the
-cost](
-https://www.youtube.com/watch?v=9QDRD0PzqCE&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=122)
+grow our data platform with:
+[![Doubling the size of the data lake without doubling the cost](https://img.youtube.com/vi/9QDRD0PzqCE/hqdefault.jpg)](https://www.youtube.com/watch?v=9QDRD0PzqCE&list=PLTPXxbhUt-YVWi_cf2UUDc9VZFLoRgu0l&index=122)
 
 Members of the Scribd team participated in a panel to discuss the past,
 present, and future of Delta Lake on the expo floor. We also took advantage of