ficus 1.0 alpha TODO #4

vpisarev · 2021-01-01T15:13:28Z

This is the list of features to be implemented before ficus 1.0. Some of the features are optional and marked as such.

- for a@index <- A, i.e. the notation to get indices of the array element elements, not just their values. index is a tuple in the case of multi-dimensional arrays.
- (...) notation in generic function argument types to denote that the particular argument should be a tuple, but its size can be an arbitrary (for each specific size and each specific set of tuple element types a separate instance of generic function is created). There is a similar notation for records: {...}
- 'typ [+] notation to denote array arguments of varying dimensionality. For each specific dimensionality a separate instance is created.
- [\A; \B], i.. use of \array notation for compound array initialization. [\A1; \A2; ... ; \An] denotes "vertical" concatenation of arrays, where rows of \Ai are put next to each other. [\A1, \A2, ..., \An] denotes horizontal concatenation, where columns of \Ai are put next to each other. There can also be a mixed combination, and also it's possible to put scalar elements.
- support for inline functions; automatic inline expansion of small functions
- elimination of index range checks when arrays are accessed sequentially and unconditionally inside the loops.
- optimized concatenation of multiple strings. Instead of iterative construction of "a" + "b" + "c" + "d" + ... as (("a" + "b") + "c") + "d" ..., which results in multiple memory allocations and super-linear complexity, the compiler computes the final concatenation with a single call of intrinsic function.
- when keyword in the patten matching to add custom predicates.
- possibility to process multiple cases in match expression with a single action, e.g. match s { | "a" | "b" => foo() | _ => bar() }. Currently, when using multiple cases, it's not possible to capture values, but this is a minor inconvenience.
- object types that will let users write lst.rev() instead of List.rev(lst) etc. This is very light-weight method to introduce object-oriented notation, even for very basic, intrinsic types.
- str[.-n] syntax, i.e. unary minus with preceding dot. It helps to conveniently access the last string character, the last element of array etc.
- [very desirable, but optional] fun foo (a: t1, b: t2, ~param3:t3=defval3) notation for named/optional parameters.
- [unzip for ...] notation to produce a tuple of arrays as the result of comprehension instead of a single array of tuples.
- [optional] automatic detection of "nothrow" functions. If such functions are called directly, we can eliminate the check of return value
- automatic detection of stack overflow. Ficus exception is thrown in such a case.
- detection of division by zero. Ficus exception is thrown in such a case.
- (mostly done except for variants) automatically generate print() and string() for any type
- automatically generate comparison function for any type.
- OOP (object types (see above) + interfaces)
- (postponed till after 1.0; hand-written ficus lexer has been implemented) ~~port the lexer generator to ficus (fxlex)~~
- (postponed till after 1.0; hand-written ficus parser has been implemented); ~~port the parser generator to ficus (fxyacc)~~
- port ficus compiler to ficus
- implement nested modules
- (postponed till after 1.0) ~~[optional] module interfaces (separate files with .fxi extension, similar to .mli files in Ocaml)~~
- separated compilation when each ficus module is compiled into a dedicated .c file
- (there is never enough tests, but at least we have some) add extensive set of unit tests
- write ficus tutorial
- parallel keyword before for loops (the current implementation uses OpenMP)
- [optional] implement full-scale Python3-like format f"{:}"
- [optional] implement image/matrix/tensor border extrapolation using img.clip[y,x], img.zero[y,x] or similar notation.
- implement loop fusion
- add immutable RRB vector data type and the associated operations including concatenation, comprehension etc.
- implement a small but versatile standard library (see ficus stdlib TODO #5) (for 1.0 release only a small part of the planned library will be implemented)

==== some extra items ====

- add optional finally clause into try-catch to close files & sockets, unlock mutexes etc. safely. Shall we also add Python-like with expression?
- check for invalid type casts, implement more type cast modes (literal to tuple, tuple to tuple, tuple to scalar type (in the latter case it means that each tuple element is casted to this scalar type and this rule is applied recursively)
- [optional] optimize tuple packing/unpacking
- [optional] add back quote syntax `expr` that expands to (expr, expr-captured-as-string, modulename, line). The feature will help to make assert's much more informative.
- more intelligent type inference in the case of overloaded/template functions and type constructors (in particular, Math.... functions, None constructor etc.)
- insert more try-catch blocks into the compiler to catch compile errors
- (maybe later) [optional] add some syntax sugar for containers. Convert for (k, d) <- mymap {...} into Map.app(), fold ... for x <- myset {...} into Set.foldl(...) etc. In general, if the object we iterate over is a module type, we will convert the loop body into a lambda function and depending on whether it's a list comprehension, folding or a simple loop, we convert it to map, foldl or app.
- [optional] add syntax to embed data files into source code, val myfont = @data "myfont.ttf"
- more smooth support for the records and record patterns in functions.
- (optional) not only literals, but also id's as default initializers in the record fields or default function parameters, e.g. None etc.
- (optional) string() and print() for function types. Maybe it should also have the __name__ attribute.
- report warnings about unused variables. preferably also report about unused or unmatched variant tags in match {} expressions.
- it looks like immutable RRB vector (see item 32, now postponed) can be super-efficient in ficus' basic operations - comprehensions and iterations. And it will do many other things in O(1) or O(logN) time. Therefore, it should become the main collection type. Therefore, we need to change array comprehensions and literals from [ ... ] to [| ... |], and leave [...] to vectors. cons-based lists will stay [: ... :].
- maybe refactor compiler to speedup incremental compilation mode when just a few lines in one of the module are modified. This should include caching of k-forms, a bit different handling of template functions/types, and completely separable compilation and maybe even some changes in the fundamental id_t type. The goal is to avoid influence of one module's k-form to another, i.e. if the module itself or the modules it depends on are not edited, it's initial k-form should be exactly the same (and thus we can eliminate the stage of k-form optimization, c code generation and c code compilation alltogether, not just the latter one).
- implement hash table, use it whenever possible instead of Map or Set, e.g. inside K_remove_unused.
- maybe add non-type template arguments to make Map and Set more efficient.
- add "-rebuild" option to rebuild an app from scratch.
- refactor backtrace dump. Use the system call backtrace() to retrieve backtrace (just the stack). Do not use "update backtrace" inside FX_CALL() macro
- test everything on Linux
- need to extend pattern matching to allow nested alternating patterns, e.g. match tokens { | (PLUS | MINUS) as op :: rest => ... }. This will also solves the problem with when, which is now not clear where to attach in the case of several alternatives.
- stylistic change. need to normalize naming where possible. For example, function formal parameters should be called "parameters", not "arguments". Whereas the actual arguments passed to function at runtime should be called "arguments". Now arguments are used almost everywhere.
- (optimization idea: just put it here for now not to forget). When internal functions access non-local values from outer scope, and those internal functions are not passed around, just called, they can be augmented to take all those extra values as arguments. Probably, the best place to do it is inside k_lift, because mutable values from outer scope still need to be converted to references before being passed as arguments.
- (update: mimalloc was replaced with a tiny yet mighty rpmalloc, which is now included into ficus runtime) download and build (or include inside or make a git submodule) mimalloc as a dependency. mimalloc significantly boosts performance of the compiler and, in general, ficus-generated code. So, it would be nice to have it enabled by default.
- (maybe later) ~~maybe build everything (mimalloc, other useful 3rd-party libraries, runtime, reference ficus compiler, ficus compiler, tests, examples etc.) using cmake~~
- (maybe later, openmp is quite good) ~~add task framework into the runtime and the standard library, probably based on https://github.com/Pithikos/C-Thread-Pool.~~
- (openmp runtime for mac is included, on linux it's always available, on windows need to test, but it should work fine there too) implement @parallel for based on the task framework, so that it's always available, not only when the produced binaries are linked with OpenMP runtime. Getting rid of OpenMP dependency will also make ficus binaries more compact. Note that OpenMP-based parallel for backend is also possible.
- (ficus compiler is built from scratch in 21 seconds with -O1 flag on M1 macbook pro. On Linux it takes <15 seconds. That is, it's fast enough) try to make ficus compiler as parallel as possible. At minimum make the last phase parallel - generation of the C code, emitting C code, invoking C compiler on the updated files. At maximum, K-form optimization should also become mostly parallel with some synchronization points.
- test everything on Windows
- @sync-blocks for map-reduce style algorithms.
- mutable record fields
- Some form of conditional compilation, e.g. -pp flag that will run .fx code through cpp preprocessor. In addition, there should also -Dparam[=value] command-line option to define certain macros that are passed to the preprocessor

The text was updated successfully, but these errors were encountered:

vpisarev · 2022-04-01T11:49:11Z

ok, except for a few items this is mostly done. See the new #16

vpisarev added the evolution label Jan 2, 2021

vpisarev added this to the 1.0 milestone Jan 2, 2021

vpisarev changed the title ~~ficus 1.0 TODO~~ ficus 1.0 alpha TODO Apr 1, 2022

vpisarev mentioned this issue Apr 1, 2022

ficus 1.0 TODO #16

Open

vpisarev closed this as completed Apr 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ficus 1.0 alpha TODO #4

ficus 1.0 alpha TODO #4

vpisarev commented Jan 1, 2021 •

edited

Loading

vpisarev commented Apr 1, 2022

ficus 1.0 alpha TODO #4

ficus 1.0 alpha TODO #4

Comments

vpisarev commented Jan 1, 2021 • edited Loading

vpisarev commented Apr 1, 2022

vpisarev commented Jan 1, 2021 •

edited

Loading