initial pass
dovinmu authored and saulpw committed Aug 1, 2023
1 parent 3bee18c commit 1cd1a22
Showing 3 changed files with 11 additions and 34 deletions.
33 changes: 5 additions & 28 deletions about/23-design.md
@@ -1,7 +1,7 @@

# Design

AIPL is intended as a simple platform for quick proof of concept data pipelines to be implemented and tested.
AIPL is intended as a simple platform for quick proof of concept AI-based data pipelines to be implemented and tested.

## Why?

@@ -41,7 +41,7 @@ At the very least, AIPL should be a useful tool to learn, explore, and prototype

## What is "implicit looping"?

a concept borrowed from APL.
It's a concept borrowed from APL.

Yes, APL, that language from the 60s that looks like this:

@@ -59,7 +59,7 @@ Now before you run away screaming, there are 3 big ideas in APL, and why Iverson

APL uses a special set of non-text symbols, a custom alphabet that nearly predates ASCII itself.
This is why it looks like alien gibberish to the uninitiated, and why APL has all but died out.
[Iverson's paper and talk for the Turing Award is entitled [Notation as a Tool of Thought]()
[Iverson's paper and talk for the Turing Award is entitled Notation as a Tool of Thought
so "notation" is ironically the focus *and* the fallacy of APL.]

The symbology is math-based (as APL is a language for teaching and doing linear algebra) and is elegantly designed, but the idea is unfortunately a non-starter for modern adoption.
@@ -80,7 +80,7 @@ When done well, these legos are composable without anything else necessary to bi

---

Thus, AIPL borrows implicit looping and tacit programming from APL, and lets go of its alien symbology.
So AIPL borrows implicit looping and tacit programming from APL, and lets go of its alien symbology.
AIPL also borrows some of APL's vocabulary, but since data pipelines are a much different domain than math (and much more has been developed in the data domain over the past 50 years), we need to develop a different set of operators.

So AIPL is also a *vocabulary discovery platform*.
@@ -101,10 +101,7 @@ Don't over-engineer your experiments and your prototypes.
Just put legos together in a logical order and see how the whole chain works.
Tune, iterate, and discover quickly if your idea is viable or not.

# The Design


# Design
# The data table

Operators take 0, 1, or 2 "operands with dimensionality", and any number of scalar (int/float/str) parameters.

@@ -130,23 +127,3 @@ These operators must use the consistent pattern for iterating over the table's d
Tables are more complex than simple vectors.
But ideally, an operator could be defined only by its smallest operation, and a decorator(?) would do the consistent iteration.

split iterates over its input, and where it finds a string, returns a 1 column table.
where it finds an int/float, it errors or returns the int.
where it finds a table, it recurses.

take:
where it finds a simple table, returns a table with only N rows

join:
a simple table of strings, returns a string

parse-url:
a url string, returns a table with 1 row

unravel:
a table of simple tables, returns a simple table

filter:
a table with a bool value column; returns a table without that column, with only rows for which that column was true


6 changes: 3 additions & 3 deletions about/23-faq.md
@@ -14,16 +14,16 @@ Before reaching that stage, you need to know how your idea can be done with AI i
You may need to explore which of the available models might be better or cheaper, figure out how exactly you would have to organize the pipeline so that you can get the results you need, and engineer the literal prompts themselves.
You might even have to scrap the idea altogether if you can't get GPT (or whatever LLM) to respond accurately--and if that's the case, you want to find that out quickly, before investing any real resources.

You want something quick-and-dirty to experiment with. You want to be able to whip up a prototype in a couple hours.
You want something quick-and-dirty to experiment with. You want to be able to whip up a prototype in a couple hours.

But you need something bigger than prompting directly to ChatGPT within the browser. It's fine for testing one thing, and you can do the pre- and post-processing yourself by hand. For anything greater than N=1, though, you're already wanting something more reproducible.

For instance, here's a script to summarize any number of webpages: https://github.com/saulpw/aipl/blob/develop/examples/summarize.aipl

At the tiny sets (N=10 or N=100) we use to validate our ideas,

To do this in Python would involve being explicit about iteration, caching, error-handling, and the result would be a more unwieldy script, with the requisite quoting and/or escaping, code out-of-order and perhaps code scattered across multiple files, even the boilerplate--these things add friction for someone who knows Python, and make it impossible for a non-coder.

At the tiny sets (N=10 or N=100) we use to validate our ideas, we want our focus to be on the experiments themselves as much as possible.

There's a progression of computational tools: from calculators, to spreadsheets, to notebooks, to scripts, to programs, to systems. Each level gives you more power and flexibility, but requires more attention and skill.

In this context, ChatGPT is only a calculator, while Python is used to create programs and systems. Python notebooks are useful but have their quirks, and don't scale well without explicit adjustments.
6 changes: 3 additions & 3 deletions about/README.md
@@ -1,7 +1,7 @@
# About AIPL

AIPL is a pseudo-computer language (skin on top of Python) that makes it easy to develop prototypes for data processing tasks.
AIPL is a pseudo-computer language (skin on top of Python) that makes it easy to develop prototypes for data processing tasks, with language models as first class citizens.

- [Announcement](23-announcement.md)
- [](23-design.md)
- [](23-faq.md)
- [Design](23-design.md)
- [FAQ](23-faq.md)
