title: Part 1 of kscript (Setting up & definitions): Writing a dynamic, interpreted, duck-typed language
tags: general, kscript

In this series, I plan to implement (from the ground up) a programming language that is dynamic, interpreted, and duck-typed. I plan to do it all in C (no C++ features required!).

I'm going to call it kscript, and it's going to be embeddable in other applications, easy to understand, and (hopefully) useful.

This tutorial assumes only basic programming knowledge. So, if you've programmed in C, C++, Java, or probably even Python, you should be fine. I'll explain the algorithms as best I can along the way.

Let's begin!

Setup

In this part, I'll set up the project. We'll be writing in plain C, using Makefile-based builds. I'm trying to keep everything as simple as possible, with no room for magic. If you're not familiar with these tools, I'll include links and explain what I'm doing.

The repository for all this code is located here: https://github.com/ChemicalDevelopment/kscript

So, I started with a few files:

  • Makefile: This is what tells our project how to build. Yes, we're using a Makefile, but I promise it's easy once it's set up
  • src/kscript.h: Header file for the project. This is what tells other programs what the project exports: its functions, definitions, types, etc.
  • src/log.c: A simple logging library, with different levels of logging (this will make debugging easier)
  • src/kscript.c: This is the command-line binary we will use to run programs
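
To give an idea of what a leveled logger like the one in src/log.c might look like, here's a minimal sketch. Everything named demo_* is hypothetical and made up for illustration (the real sources are in the repository); the only thing taken from this post is the "INFO : message" output format. I've also assumed the output goes to stderr.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical log levels, from most to least verbose. */
enum { DEMO_TRACE = 0, DEMO_DEBUG, DEMO_INFO, DEMO_WARN, DEMO_ERROR };

/* Level names padded to equal width so that messages line up. */
static const char *const demo_level_names[] = {
    "TRACE", "DEBUG", "INFO ", "WARN ", "ERROR"
};

/* Messages below this level are dropped. */
static int demo_min_level = DEMO_INFO;

void demo_log_set_level(int level) { demo_min_level = level; }

/* Format "LEVEL: message" into buf. Returns the length of the formatted
 * line (before any truncation), or 0 if the message was dropped. */
int demo_log_format(char *buf, size_t size, int level, const char *fmt, ...) {
    va_list ap;
    int used, n;

    if (size == 0) return 0;
    buf[0] = '\0';
    if (level < DEMO_TRACE || level > DEMO_ERROR) return 0;
    if (level < demo_min_level) return 0;

    used = snprintf(buf, size, "%s: ", demo_level_names[level]);
    if (used < 0 || (size_t)used >= size) {
        buf[0] = '\0'; /* not even the prefix fits */
        return 0;
    }

    va_start(ap, fmt);
    n = vsnprintf(buf + used, size - (size_t)used, fmt, ap);
    va_end(ap);
    return n < 0 ? used : used + n;
}

/* Convenience wrapper in the spirit of ks_info(): logs at INFO level.
 * Writing to stderr is an assumption of this sketch. */
void demo_info(const char *msg) {
    char line[256];
    if (demo_log_format(line, sizeof line, DEMO_INFO, "%s", msg) > 0)
        fprintf(stderr, "%s\n", line);
}
```

Calling demo_info("Hello World!") prints a line in the same shape as the one kscript prints later in this part. Raising the minimum level with demo_log_set_level(DEMO_WARN) silences it.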

Now, our Makefile will build two files:

  • libkscript.so: A shared-object library that any other C program can use! This way other applications could run kscript as, say, an embedded programming language.
  • kscript: This is the binary. Eventually, you'll be able to run files and expressions with it, like ./kscript file.kscript. Getting there might take a while, though!

You can check the GitHub repo (specifically, the first commit) for full sources.

Here's src/kscript.c (we're just testing out and making sure everything works):

// src/kscript.c
#include "kscript.h"

int main(int argc, char** argv) {

    ks_info("Hello World!");

    return 0;
}

Now, our directory should look like:

.
├── Makefile
├── README.md
└── src
    ├── kscript.c
    ├── kscript.h
    └── log.c

Now, to test, run make && ./kscript:

$ make && ./kscript
cc -O3 -std=c99 -fPIC src/kscript.c -c -o src/kscript.o
cc -O3 -std=c99 -fPIC src/log.c -c -o src/log.o
cc -O3 -std=c99 -shared src/log.o -o libkscript.so
cc -O3 -std=c99 -L./ src/kscript.o -lkscript -o kscript
INFO : Hello World!

You can see that it worked, and printed out Hello World. Great! Now we have a working build system. Every time we add a new .c file, we just add it to the Makefile and run make again.

Design

Now, let's go ahead and design our language by saying what it should (and shouldn't) do.

Overall, this language should:

  • Be easy to operate
  • Use builtin functions whenever possible
  • Prefer operators over method calls (i.e. always try to use a[i] instead of a.get(i)) whenever possible
  • Be able to do a lot of stuff in a few lines of code
  • Not mention types explicitly unless the code is using reflection
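
To make the a[i] goal a bit more concrete: one common way for an interpreter written in C to support operators on any type is to give every type a table of function-pointer "slots", and have the operator call through the slot. The sketch below is only an illustration of that idea. All the demo_* names are hypothetical, and kscript's real object interface may end up looking quite different.

```c
#include <stddef.h>

typedef struct demo_type demo_type;

/* Every value carries a pointer to its type. */
typedef struct {
    const demo_type *type;
    void *data;
    size_t len;
} demo_value;

struct demo_type {
    const char *name;
    /* Slot behind a[i]; NULL if the type does not support indexing. */
    int (*getitem)(const demo_value *self, size_t i, int *out);
};

/* Indexing for a list of ints: bounds-check, then read the element. */
int demo_intlist_getitem(const demo_value *self, size_t i, int *out) {
    if (i >= self->len) return -1;
    *out = ((const int *)self->data)[i];
    return 0;
}

const demo_type demo_intlist_type = { "intlist", demo_intlist_getitem };
const demo_type demo_opaque_type = { "opaque", NULL };

/* What the interpreter would run for the expression a[i].
 * Returns 0 on success, -1 if a is not indexable or i is out of range. */
int demo_index(const demo_value *a, size_t i, int *out) {
    if (a->type->getitem == NULL) return -1;
    return a->type->getitem(a, i, out);
}
```

The point of the design is that demo_index never needs to know which type it is dealing with, so user-defined types can plug into the same operator.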

So, it sounds a lot like Python, eh? Well, I do think Python has a lot of good things, but I'm not going to clone Python.

Here are some things I like about Python:

  • Fully dynamic, almost anything can be done with just a few lines of code
  • Duck Typing. This makes it easy for people to write extensible libraries that just rely on named methods (like .do() or .run())
  • ; are not required (but you can use them for multiple statements in a line)
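
Duck typing can be sketched in C as looking a method up by name at call time, instead of checking what type the object is. This is a rough illustration under my own assumptions: the demo_* names are hypothetical, and a real interpreter would likely use a hash table rather than a linear search.

```c
#include <stddef.h>
#include <string.h>

typedef struct demo_obj demo_obj;
typedef int (*demo_method_fn)(demo_obj *self);

/* A named method, e.g. "run" or "do". */
typedef struct {
    const char *name;
    demo_method_fn fn;
} demo_method;

struct demo_obj {
    const demo_method *methods;
    size_t n_methods;
    int state;
};

/* Look a method up by name; returns NULL if the object lacks it. */
demo_method_fn demo_find_method(const demo_obj *obj, const char *name) {
    size_t i;
    for (i = 0; i < obj->n_methods; ++i) {
        if (strcmp(obj->methods[i].name, name) == 0) return obj->methods[i].fn;
    }
    return NULL;
}

/* Call obj.name(). Returns 0 and stores the result in *out on success,
 * or -1 if the object has no method with that name. */
int demo_call(demo_obj *obj, const char *name, int *out) {
    demo_method_fn fn = demo_find_method(obj, name);
    if (fn == NULL) return -1;
    *out = fn(obj);
    return 0;
}

/* Two unrelated kinds of object that both happen to provide run(). */
int demo_counter_run(demo_obj *self) { return ++self->state; }
int demo_doubler_run(demo_obj *self) { self->state *= 2; return self->state; }

const demo_method demo_counter_methods[] = { { "run", demo_counter_run } };
const demo_method demo_doubler_methods[] = { { "run", demo_doubler_run } };
```

A library that only ever calls demo_call(obj, "run", &result) works with both kinds of object, and with any future one that provides a run method.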

And, here are some things I don't like about Python:

  • Significant whitespace. Some people will argue it's great, saves mental energy, and enforces some style guidelines. However, I find myself running one-liners a lot, or at least running code from the command line, and Python is awful at this. Tell me, how can you write an inner and outer for loop on one line without it looking awful? In a language with {}-blocks, this is extremely clear.
  • Some weirdness in the standard library (like str vs bytes) makes things hard to remember. I constantly find myself looking up how to .encode() and .decode(). I don't like having to look things up; I think things should just work.

So, it'll be similar to Python, but syntactically pretty different.

Here are a few examples I've come up with:

# hello_world.kscript

print ("Hello World!")

# fibonacci.kscript

fibonacci(n) := {
    # base cases: fibonacci(0) = 0, fibonacci(1) = 1
    if (n <= 1) {
        return n
    }
    return fibonacci(n-1) + fibonacci(n-2)
}

print ("fibonacci(", 0, ")=", fibonacci(0))
print ("fibonacci(", 1, ")=", fibonacci(1))
print ("fibonacci(", 2, ")=", fibonacci(2))
print ("fibonacci(", 3, ")=", fibonacci(3))
print ("fibonacci(", 4, ")=", fibonacci(4))
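
For reference, here's the same recursion written in C, using the usual base cases fibonacci(0) = 0 and fibonacci(1) = 1. It's just a sketch for checking the milestone's output once kscript can run it. The name demo_fibonacci is mine, and it assumes n is not negative.

```c
/* Naive recursive Fibonacci, assuming n >= 0. */
long demo_fibonacci(int n) {
    if (n <= 1) {
        return n; /* base cases: fibonacci(0) = 0, fibonacci(1) = 1 */
    }
    return demo_fibonacci(n - 1) + demo_fibonacci(n - 2);
}
```

For n from 0 to 4, this gives 0, 1, 1, 2, 3.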

# matrix.kscript

A = [[1, 2], [3, 4]]
B = [[1, 0], [1, 1]]

matmul(a, b) := {
    if a[0].len != b.len {
        # error
    }
    c = [[0] * b[0].len] * a.len

    for i in 0:a.len {
        for j in 0:b[0].len {
            for k in 0:a[0].len {
                c[i][j] += a[i][k] * b[k][j]
            }
        }
    }
    return c
}

print (matmul(A, B))
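
As a reference for what this milestone should print, here's a small C sketch of the same triple loop. It assumes matrices are stored as flat row-major arrays, and the name demo_matmul is hypothetical. Passing the shared dimension m once plays the role of the a[0].len != b.len check. One thing to keep in mind when kscript's list semantics get defined: in Python, [[0] * p] * n makes every row refer to the same list, so the rows of c need to be separate lists for the loop to work.

```c
#include <stddef.h>

/* c (n x p) = a (n x m) * b (m x p); all matrices are flat, row-major. */
void demo_matmul(const int *a, const int *b, int *c,
                 size_t n, size_t m, size_t p) {
    size_t i, j, k;
    for (i = 0; i < n; ++i) {
        for (j = 0; j < p; ++j) {
            c[i * p + j] = 0;
            for (k = 0; k < m; ++k) {
                c[i * p + j] += a[i * m + k] * b[k * p + j];
            }
        }
    }
}
```

With A = [[1, 2], [3, 4]] and B = [[1, 0], [1, 1]] from above, the product is [[3, 2], [7, 4]].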

Obviously, some of these will take some work to get working. They're just meant as milestones that our programming language should handle.

Stick around for more parts! Next time, we'll actually start writing an object interface in C (for dynamic types!).

Source for this part: https://github.com/ChemicalDevelopment/kscript/tree/48c6b0660b68b000c002994b9677724c486854a2
