@jklwn jklwn commented Sep 4, 2019

This pull request is ready now for review.

It has four sections (each one split into several commits):

  • compiler changes and additions mostly for the preprocessor to allow for more flexible macros
  • compiler additions for the run time library implementation
  • an additional include file as FreeBASIC front end for new array features
  • additions to the run time library

There are comments after each section for further information

JK

…variadic parameter may contain commas, so it may expand to more than one parameter, this also allows for a macro to receive fewer arguments than specified in the macro's definition, if there is a variadic parameter.
…ty variadic parameter, the else clause was missing
…efining an already defined key word, the existing definition is not lost, but temporarily superseded by the new definition. The original definition becomes valid again as soon as a scope block is left, or by using #UNDEF, in global scope #UNDEF is mandatory.
…at is, it returns the argument count of a (variadic) parameter. An empty parameter counts as one parameter. This can be extremely useful for processing variadic macro parameters.
… as "#" operator) but as uppercase. This is useful for case independent parsing of macro arguments.
jklwn commented Sep 4, 2019

The first section (mostly for the preprocessor to allow for more flexible macros) is ready now.

TYPEOF :
Returns the type of a variable at run time (the previous version worked only in the preprocessor, i.e. at compile time). This might not be the most elegant solution; it is a workaround for a missing "variabletype" function. Returning the variable type as a string has one major advantage over a "variable type derived from dtype" approach: it allows UDTs to be told apart by their name. It accepts "()" for array parameters too.

Macro parameter check modification:
The compiler checks the parameter count of macros and throws a compile error if the actual number of parameters doesn't match the definition. This check has been disabled for macros with a variadic parameter. That is, a variadic parameter may now not only be missing entirely, but may also count as one or more parameters. A variadic parameter may contain commas, which in effect makes it more than one parameter. A macro with a variadic parameter may now also receive fewer parameters than specified in its definition.

BUGFIX:
Stringize ("#") didn't work for non-ANSI files and an empty (variadic) parameter. The appropriate code was simply missing.

#REDEF :
Allows temporarily re-defining (and thus overriding) an already existing #define (which was forbidden before). The new definition is valid only in the current scope, and the previous definition is automatically restored when the scope ends. In global scope, use #UNDEF to restore the previous #define. This allows otherwise forbidden key words to be used in macros and elsewhere.

"###" :
new preprocessor operator, which returns the number of arguments (count of commas + 1) in a variadic macro parameter.

"#&#" :
Stringize uppercase: a macro parameter is returned as a string, same as with "#", but converted to uppercase.
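
For readers who know the C preprocessor, the behaviour of "###" resembles the common argument-counting idiom there. The sketch below is plain C99 and not FreeBASIC; `PP_NARG`, `PP_ARG_N` and `count_demo` are names chosen for illustration only, and this variant is limited to 8 arguments. As with "###", an empty parameter counts as one.

```c
/* Select the 9th argument; the arguments before it are discarded. */
#define PP_ARG_N(_1, _2, _3, _4, _5, _6, _7, _8, N, ...) N

/* Appending a descending list shifts the right count into the Nth slot. */
#define PP_NARG(...) PP_ARG_N(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)

/* Small usage example: three arguments are passed, so this returns 3. */
int count_demo(void)
{
    return PP_NARG(10, 20, 30);
}
```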

JK

…rayCalcPos, fb_ArrayCalcIdxPos, fb_ArrayCalcIdxPtr, fb_ArrayShift, fb_ArraySort, fb_ArrayAttach, fb_ArrayReset, fb_ArrayScan)
jklwn commented Sep 4, 2019

Section two is ready. This adds all new RTL function definitions needed for the new array features

It might still be necessary to apply minor changes to these definitions in rtl-array.bas later on.

JK

jklwn commented Sep 5, 2019

As a third step "array.bi" has been added. This file contains the necessary definitions and code. It will be updated along with the following commits for the RTL in order to activate the provided features.

JK

…ompile time errors (#ERROR ...) inside macros is one line off sometimes, #LINE __PREVLINE__ can fix this.
… overlay and for resetting such an array to default state again.
…nter, linear (one based) index and (multidimensional) array indices
…rom an existing array. This doesn't change (REDIM) the element count of an array
…ions for each standard variable type, as well as support for custom sort functions
… literal or expression, custom search functions are supported too.
jklwn commented Sep 7, 2019

The last section covers additions to the RTL and corresponding changes to the compiler and "array.bi"

All new features work with all kinds of arrays regardless of the data type. Passed z/wstring arrays don't work because of an existing bug. Applying #158 to this PR fixes this problem.

Arrays:
You could think of an array as a list of consecutive elements of the same kind in memory. An element could be a standard variable type or a structure (user defined type). An element is accessed by its index. The first element gets index 1, the second element gets index 2 and so on. Actually you may define an array to begin at a different index than 1; the compiler is clever enough to do the necessary math for you.

This seems pretty obvious so far. It becomes a bit more complicated with multidimensional arrays, where you must specify more than one index value for accessing an element. The memory layout is the same, but the overlaying logic is different. Nevertheless any multidimensional array may be seen as a one dimensional array too.

In fact "DIM a(1 to 8) AS BYTE" allocates the same amount of memory as "DIM b(1 to 2, 1 to 2, 1 to 2) AS BYTE". In case of a() the first element in memory is a(1), the second a(2) and so on, in case of b(), the first element is b(1,1,1), the second element in memory is b(1,1,2), the third is b(1,2,1).

If we look at b() as if it was a one dimensional array with a starting index of 1, we would have a "linear index" of 1 for (1,1,1), of 2 for (1,1,2), 3 for (1,2,1) and so on.

All array features work for multidimensional arrays too. You may specify a certain element by its "regular index" (x,y,z,...) or by its "linear index". It is important to understand that array scan always returns the linear index, which is identical to the one dimensional index with a starting index of 1.

Example:
for "DIM a(1 to 8) AS BYTE" a linear index of 4 denotes element 4
for "DIM a(10 to 18) AS BYTE" a linear index of 4 denotes element 13 (!)
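
To make the linear index concrete, here is a minimal C sketch of the arithmetic, assuming the memory layout described above (last index varies fastest). `dim_bounds` and `linear_index` are hypothetical names for illustration and are not the functions added by this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one array dimension (hypothetical, for illustration). */
typedef struct {
    long lbound;
    long ubound;
} dim_bounds;

/* Returns the one-based linear index for idx[0..ndims-1],
   or 0 if any index lies outside its bounds. */
size_t linear_index(const dim_bounds *dims, size_t ndims, const long *idx)
{
    size_t pos = 0;

    assert(dims != NULL && idx != NULL && ndims > 0);
    for (size_t d = 0; d < ndims; d++) {
        if (idx[d] < dims[d].lbound || idx[d] > dims[d].ubound)
            return 0;
        size_t extent = (size_t)(dims[d].ubound - dims[d].lbound + 1);
        /* Scale what we have so far, then add the zero-based offset. */
        pos = pos * extent + (size_t)(idx[d] - dims[d].lbound);
    }
    return pos + 1; /* one-based, so 0 stays free for "invalid" */
}
```

With bounds (1 to 2, 1 to 2, 1 to 2), the indices (1,2,1) map to 3; with bounds (10 to 18), index 13 maps to 4, matching the examples above.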

Syntax:

array(sort, ... no return value


Sorts an array of arbitrary type. For arrays of UDTs you must define and use a callback function for sorting. For all standard variable types there is a ready-made sorting function in the RTL, or you may use a user-defined callback if you like. In general, using a callback function is slower because of the overhead of multiple calls.

The default sorting direction is ascending ("up"), if not specified otherwise. For sorting in descending order you must specify "down".

By default all elements of an array are sorted. You may specify a starting index and a count of elements to sort. If no count is given, all elements up to the end of the array are sorted.

The starting index may be given as a regular index (index value(s) in parentheses, as usual: "(x,y,...)") or as a linear index. For a linear index, the key word "pos" is required: "pos(i)".

You may specify a second array of arbitrary type to be sorted in the same order as the first
array. That is, the two arrays can be of different types. In this case both arrays must be enclosed
in parentheses (see example #4).

Arrays of strings (zstring/wstring/string and ustring) may be sorted case insensitive. Default is case sensitive. For a case insensitive sorting you must specify the "nocase" key word in parenthesis with the string array to sort (see example #3 and #4).


array(sort, array)
array(sort, array, down [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
array(sort, (array, nocase))
array(sort, (array1, array2, nocase) [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
array(sort, (array, nocase), down [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
array(sort, array, @customsortproc [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
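
As an illustration of the "starting position plus count" semantics, here is a small C sketch built on the standard `qsort`. It is not the RTL implementation; `sort_range` and the comparison helpers are hypothetical, the element type is fixed to `int`, and a count of 0 is assumed to stand for "no count given".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparison for ints, written to avoid overflow. */
static int cmp_int_up(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Descending comparison: swap the operands. */
static int cmp_int_down(const void *a, const void *b)
{
    return cmp_int_up(b, a);
}

/* Sorts `count` elements starting at the one-based linear position `pos`.
   count == 0 means "up to the end of the array". */
void sort_range(int *data, size_t total, size_t pos, size_t count, int descending)
{
    assert(data != NULL);
    if (pos == 0 || pos > total)
        return;
    size_t avail = total - (pos - 1);
    if (count == 0 || count > avail)
        count = avail;
    qsort(data + (pos - 1), count, sizeof data[0],
          descending ? cmp_int_down : cmp_int_up);
}
```

The function pointer passed to `qsort` plays the role of the custom sort callback; each comparison is an indirect call, which is the overhead mentioned above.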

array(insert, ... no return value


Inserts an element into an existing array by shifting the already present elements up by one. Currently the last element is shifted out and thus deleted. For elements with a destructor, the destructor
is called. You can prevent this by re-dimming the array to an appropriate size before inserting new elements.

"Value" must be of the same variable type as the array itself and is assigned to the given position. You may only insert one element at a time.

Position may be a regular index or a linear index. By default all elements up to the end of the array are shifted. Count restricts this shift operation to the specified number of elements.
The last element covered by "count", is deleted in this case.


array(insert, array, value [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
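
The shifting behaviour can be sketched in C as follows. This is only an illustration for plain `int` elements (no destructors); `insert_at` is a hypothetical name, and the sketch assumes that "count" is the number of elements in the affected range, including the insert position, with 0 standing for "no count given".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inserts `value` at the one-based linear position `pos`.
   Elements in the affected range move up by one; the last element
   of that range is shifted out and lost. */
void insert_at(int *data, size_t total, size_t pos, size_t count, int value)
{
    assert(data != NULL);
    if (pos == 0 || pos > total)
        return;
    size_t avail = total - (pos - 1);
    if (count == 0 || count > avail)
        count = avail;
    /* memmove handles the overlapping source and destination. */
    memmove(data + pos, data + (pos - 1), (count - 1) * sizeof data[0]);
    data[pos - 1] = value;
}
```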

array(delete, ... no return value


Deletes an element from an existing array by shifting the already present elements down by one. The last element is initialized to the data type's default value. For elements with a constructor, the
default constructor is called. You may prevent a last orphaned element by re-dimming the array
to an appropriate size after deleting elements.

You may only delete one element at a time.

Position may be a regular index or a linear index. By default all elements up to the end of the array are shifted. Count restricts this shift operation to the specified number of elements. The last element covered by "count" is initialized to its default in this case.


array(delete, array [, (i1 [, i2 [, ...]]) | pos(i) [, count]])
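
A corresponding C sketch for delete, again for plain `int` elements only. `delete_at` is a hypothetical name, 0 stands in for the data type's default value, and "count" is assumed to be the size of the affected range starting at the delete position (0 for "no count given").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Deletes the element at the one-based linear position `pos`.
   Elements in the affected range move down by one; the last element
   of that range is reset to the default value (0 here). */
void delete_at(int *data, size_t total, size_t pos, size_t count)
{
    assert(data != NULL);
    if (pos == 0 || pos > total)
        return;
    size_t avail = total - (pos - 1);
    if (count == 0 || count > avail)
        count = avail;
    memmove(data + (pos - 1), data + pos, (count - 1) * sizeof data[0]);
    data[pos - 1 + count - 1] = 0;
}
```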

array(scan, ... returns linear index


Searches an array for a specified value, given in the for() clause. "Searchterm" must be of a compatible type. That is, you may mix all numeric types (except floating point types) and all string types. Obviously not every allowed combination makes sense under all circumstances; it's up to the user to interpret the results correctly. A value of zero is returned if the search term is not present in the array; otherwise the linear position (!) of the first occurrence is returned.

Position may be a regular index or a linear index. By default all elements up to the end of the array are searched. Count restricts this search operation to the specified number of elements.

Arrays of strings (zstring/wstring/string and ustring) may be searched case insensitive. The default is case sensitive. For a case insensitive search you must specify the "nocase" key word in the for() clause (see example #2).

For arrays of UDTs you must define and use a callback function for searching. For all standard variable types there is a ready-made search function in the RTL. You may use a user-defined callback for special searches, if you like. In general, using a callback function is slower because of the overhead of multiple calls.

A return value of 0 may indicate either "not found" or "an error occurred", you must check ERR in this case.


i = array(scan, array, for()[, (i1 [, i2 [, ...]]) | pos(i) [, count]])

for(searchterm [, nocase [, from]])
for(searchterm [,from])
for(@customproc)
'***********************************************************************************************
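
The return convention of scan (one-based linear position, 0 for "not found") can be sketched in C like this. `scan_for` is a hypothetical name, the element type is fixed to `int`, and a count of 0 is assumed to stand for "no count given".

```c
#include <assert.h>
#include <stddef.h>

/* Searches `count` elements starting at the one-based linear position
   `pos` for `term`. Returns the one-based linear position of the first
   match, or 0 if there is none. */
size_t scan_for(const int *data, size_t total, int term, size_t pos, size_t count)
{
    assert(data != NULL);
    if (pos == 0 || pos > total)
        return 0;
    size_t avail = total - (pos - 1);
    if (count == 0 || count > avail)
        count = avail;
    for (size_t i = 0; i < count; i++) {
        if (data[pos - 1 + i] == term)
            return pos + i;
    }
    return 0;
}
```

Because the result is unsigned and one-based, 0 can serve as "not found" on both 32 and 64 bit targets without reserving a value from the valid index range.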

array(<specifier>, ...


Returns the requested value from the array's descriptor. Some values are already available
either directly or by doing some math. This offers a consistent interface for all values,
which might be useful.


i = array(specifier, array), returns any ptr (1), integer (2,3,4,5) or boolean (6,7,8,9)
'***********************************************************************************************
data 'pointer to the first array element in memory
dimensions '# of dimensions (same as UBOUND(array, 0))
total_size 'in bytes
total_count 'total # of elements in all dimensions
size 'same as sizeof(array)
is_fixed_len 'fixed len array
is_fixed_dim 'fixed dimension array
is_dynamic 'dynamic array
is_attached 'attached or not

array(pos, ...
array(ptr, ...
array(index, ...


Calculates the linear index from a regular index and vice versa, and does the same for a given memory pointer.

This is necessary for retrieving the regular index from the linear index returned by "array(scan, ...", and it is necessary when dealing with multidimensional arrays.


i = array(pos, array, (i1 [, i2 [,...]])) returns linear (one based) position from index
p = array(ptr, array, (i1 [, i2 [,...]])) returns memory ptr for this index
u = array(index, array, pos) returns array_index from linear index
u = array(index, array, ptr) returns array_index
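
The conversion from a linear index back to regular indices is the inverse of the position calculation. Here is a minimal C sketch, assuming the memory layout described earlier (last index varies fastest); `dim_range` and `index_from_linear` are hypothetical names and not the RTL functions of this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one array dimension (hypothetical, for illustration). */
typedef struct {
    long lbound;
    long ubound;
} dim_range;

/* Converts the one-based linear position `pos` into regular indices,
   written to idx[0..ndims-1]. Returns 1 on success, 0 if `pos` is
   outside the array. */
int index_from_linear(const dim_range *dims, size_t ndims, size_t pos, long *idx)
{
    size_t total = 1;

    assert(dims != NULL && idx != NULL && ndims > 0);
    for (size_t d = 0; d < ndims; d++)
        total *= (size_t)(dims[d].ubound - dims[d].lbound + 1);
    if (pos == 0 || pos > total)
        return 0;

    size_t rest = pos - 1; /* zero-based offset */
    for (size_t d = ndims; d-- > 0; ) {
        size_t extent = (size_t)(dims[d].ubound - dims[d].lbound + 1);
        idx[d] = dims[d].lbound + (long)(rest % extent);
        rest /= extent;
    }
    return 1;
}
```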

array(attach, ...
array(reset, ...


Redims an array at a specific memory location. You must specify the index (or indices) as usual for "REDIM()". No memory is allocated or deallocated; instead existing memory is made accessible by an overlay structure in the form of an array. This can be useful for all kinds of direct data manipulation.

Array reset erases the array descriptor again without erasing the memory.

This feature requires an empty (not yet dimmed or erased) dynamic array and doesn't work with
fixed size arrays.


array(attach, array, redim(1 to 5[, 1 to 6 [, ...]]), memory ptr)
array(reset, array)
'***********************************************************************************************
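
The idea behind attach and reset can be sketched in C as a descriptor that points at memory it does not own. This is a one dimensional illustration only; `int_array_view`, `view_attach`, `view_reset` and `view_at` are hypothetical names and do not reflect the actual FreeBASIC array descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal one dimensional descriptor overlaying existing memory. */
typedef struct {
    int *data;       /* first element; not owned by the view */
    long lbound;
    long ubound;
    int is_attached;
} int_array_view;

/* Attaches an empty view to `memory`. Nothing is allocated.
   Returns 1 on success, 0 if the view is already in use or the
   arguments are invalid. */
int view_attach(int_array_view *v, int *memory, long lbound, long ubound)
{
    assert(v != NULL);
    if (v->data != NULL || memory == NULL || ubound < lbound)
        return 0;
    v->data = memory;
    v->lbound = lbound;
    v->ubound = ubound;
    v->is_attached = 1;
    return 1;
}

/* Clears the descriptor; the underlying memory is left untouched. */
void view_reset(int_array_view *v)
{
    assert(v != NULL);
    v->data = NULL;
    v->lbound = 0;
    v->ubound = -1;
    v->is_attached = 0;
}

/* Bounds-checked element access; returns NULL if out of range. */
int *view_at(const int_array_view *v, long i)
{
    assert(v != NULL);
    if (v->data == NULL || i < v->lbound || i > v->ubound)
        return NULL;
    return v->data + (i - v->lbound);
}
```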

jklwn commented Sep 7, 2019

I didn't know that Travis fails on warnings ...

This PR is now ready for review. I coded and ran tests, which will have to be ported to the test suite as soon as this PR is in a state in which it can be accepted.

JK

@Mr-Swiss

There is just one thing that bothers me currently, with respect to INDEXING.
Currently by default in FB (as in C), all indexing (not just arrays) starts with 0 (NULL/zero).
(Unless it is explicitly stated that another LBound() is to be used.)
Therefore, your reference to "starts with 1" is not really logical (and therefore confusing).
Would you please elaborate on the issue?

jklwn commented Sep 11, 2019

Well, I have three reasons for starting at 1 for the "linear index":

1.) It's important to understand that a "linear index" is a different thing than a regular index.
Both do the same thing but in different ways. Given the actual memory layout of FB arrays,
any array (even a multi-dimensional one) can be seen as a one dimensional array with an lbound of 1.
That is, if you have a one dimensional array with an lbound of 1, the linear index corresponds
to the regular index (1 to ...). In all other cases (more than one dimension, different
lbound) the two are different.
If the linear index defaulted to zero like lbound, it would mask the difference, and
I fear most users wouldn't realize that there is a difference at all.

2.) With array(scan, ...) I need a return value for "not found". Zero seems to be the most logical
value. When counting elements of a list, an array, an enumeration, etc. the first element is usually
referred to as "first" (the first character in the alphabet, the first man on the moon, etc.).
INSTR() and MID() index characters the same way: the first character gets index 1. The reason why
lbound defaults to 0 is that the math is simpler and faster with an index of zero for the first element.

3.) lbound and ubound are defined as "INTEGER", which means an array's index can cover the whole
range of an integer variable. So my result must cover the range of an integer type too. But then
which value should consistently represent "not found"? &HFFFFFFFF is not an option, because this
would break 32/64 bit compatibility; having &HFFFFFFFF for "not found" in 32 bit and
&HFFFFFFFFFFFFFFFF in 64 bit is confusing and ugly.
Making the result a "UINTEGER" solves this problem: I can have zero for "not found" in 32 and
64 bit and the rest of the range (of integer size) for valid (one based) results.

So in case your array is not one dimensional or its lbound is not 1, you must use
<array_index> = array(index, array, pos) for conversion.

"Array_index" is defined in array.bi as

type array_index 'return type for array(index, ...)
n as integer '# of valid index entries, 0 -> no indices given

li as integer 'linear index (one based), 0 = invalid
i1 as integer 'index 1
i2 as integer '...
i3 as integer
i4 as integer
i5 as integer
i6 as integer
i7 as integer
i8 as integer
end type

JK

jklwn added 3 commits December 3, 2019 22:24
…t open bracket, adapt array.bi. this allows for arrays to be specified with and without brackets (a - a() both are valid)
jklwn commented Dec 4, 2019

I changed the allowed syntax to accept array parameters with and without brackets. The preferred syntax should be "array()" (with brackets) for clarity.

@jklwn jklwn closed this Jul 11, 2020