-
Notifications
You must be signed in to change notification settings - Fork 15
/
Copy pathindex.qmd
116 lines (89 loc) · 4.42 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
---
format:
html:
css:
- _static/css/design-style.css
link-external-icon: true
link-external-newwindow: true
toc: false
listing:
- id: cards
template: _templates/card.ejs
contents:
- name: "Real-time Aggregation"
icon: bi bi-clock-history
content: Precompute model inputs from streaming data with robust data connectors, transformations & aggregations.
- name: Event Detection
icon: bi bi-binoculars-fill
content: Trigger pro-active AI behaviors by identifying important activities, as they happen.
- name: History Replay
icon: bi bi-skip-backward-fill
content: Backtest and fine-tune from historical data using per-example time travel and point-in-time joins.
---
::::: {.px-4 .py-5 .my-5 .text-center}
<img class="d-block mx-auto mb-4 only-light" src="_static/images/kaskada-positive.svg" alt="" width="50%">
<img class="d-block mx-auto mb-4 only-dark" src="_static/images/kaskada-negative.svg" alt="" width="50%">
<h1 class="display-5 fw-bold">Real-Time AI without the fuss.</h1>
::: {.col-lg-7 .mx-auto}
[Kaskada is a next-generation streaming engine that connects AI models to real-time & historical data.]{.lead .mb-4}
:::
:::::
## Kaskada completes the Real-Time AI stack, providing...
::: {#cards .column-page}
:::
## Real-time AI in minutes
Connect and compute over databases, streaming data, _and_ data loaded dynamically using Python..
Kaskada is seamlessly integrated with Python's ecosystem of AI/ML tooling so you can load data, process it, train and serve models all in the same place.
There's no infrastructure to provision (and no JVM hiding under the covers), so you can jump right in - check out the [Quick Start](./guide/quickstart.qmd).
## Built for scale and reliability
Implemented in [Rust](https://www.rust-lang.org/) using [Apache Arrow](https://arrow.apache.org/), Kaskada's compute engine uses columnar data to efficiently execute large historic and high-throughput streaming queries.
Every operation in Kaskada is implemented incrementally, allowing automatic recovery if the process is terminated or killed.
With Kaskada, most jobs are fast enough to run locally, so it's easy to build and test your real-time queries.
As your needs grow, Kaskada's cloud-native design and support for partitioned execution gives you the volume and throughput you need to scale.
Kaskada was built by core contributors to [Apache Beam](https://beam.apache.org/), [Google Cloud Dataflow](https://cloud.google.com/dataflow), and [Apache Cassandra](https://cassandra.apache.org/), and is under active development
* * *
## Example Real-Time App: BeepGPT
[BeepGPT](https://github.com/kaskada-ai/beep-gpt/tree/main) keeps you in the loop without disturbing your focus. Its personalized, intelligent AI continuously monitors your Slack workspace, alerting you to important conversations and freeing you to concentrate on what’s most important.
The core of BeepGPT's real-time processing requires only a few lines of code using Kaskada:
```python
import asyncio
import kaskada as kd
kd.init_session()
# Bootstrap from historical data
messages = await kd.sources.PyDict.create(
rows = pyarrow.parquet.read_table("./messages.parquet")
.to_pylist(),
time_column = "ts",
key_column = "channel",
)
# Send each Slack message to Kaskada
def handle_message(client, req):
messages.add_rows(req.payload["event"])
slack.socket_mode_request_listeners.append(handle_message)
slack.connect()
# Aggregate multiple messages into a "conversation"
conversations = ( messages
.select("user", "text")
.collect(max=20)
)
# Handle each conversation as it occurs
async for row in conversations.run_iter(mode='live'):
# Use a pre-trained model to identify interested users
prompt = "\n\n".join([f'{msg["user"]} --> {msg["text"]}' for msg in row["result"]])
res = openai.Completion.create(
model="davinci:ft-personal:coversation-users-full-kaskada-2023-08-05-14-25-30",
prompt=prompt + "\n\n###\n\n",
logprobs=5,
max_tokens=1,
stop=" end",
temperature=0.25,
)
# Notify interested users using the Slack API
for user_id in interested_users(res):
notify_user(row, user_id)
```
For more details, check out the [BeepGPT Github project](https://github.com/kaskada-ai/beep-gpt).
* * *
## Get Started
Getting started with Kaskda is a `pip install kaskada` away.
Check out the [Quick Start](./guide/quickstart.qmd) now!