Array #173

jklwn · 2019-09-04T16:43:16Z

This pull request is ready now for review.

It has four sections (each one split into several commits):

compiler changes and additions mostly for the preprocessor to allow for more flexible macros
compiler additions for the run time library implementation
an additional include file as FreeBASIC front end for new array features
additions to the run time library

There are comments after each section for further information

JK


…variadic parameter may contain commas, so it may expand to more than one parameter. This also allows for a macro to receive fewer arguments than specified in the macro's definition, if there is a variadic parameter.


…efining an already defined key word, the existing definition is not lost, but temporarily superseded by the new definition. The original definition becomes valid again as soon as a scope block is left, or by using #UNDEF, in global scope #UNDEF is mandatory.


jklwn · 2019-09-04T20:42:24Z

The first section (mostly for the preprocessor to allow for more flexible macros) is ready now.

TYPEOF :
returns the type of a variable at run time (the previous version worked only in the preprocessor, i.e. at compile time). This might not be the most elegant solution; it is a workaround for a missing "variabletype" function. Returning the variable type as a string has one major advantage over a "variable type derived from dtype" approach: it allows for keeping UDTs apart by their name. Accepts "()" for array parameters too.

Macro parameter check modification:
the compiler checks the parameter count of macros and throws a compile error if the actual number of parameters doesn't match the definition. This check has been disabled in case of a variadic macro parameter. That is, a variadic parameter may now not only be missing altogether, but may also count as one or more parameters. A variadic parameter may have commas inside, which in effect makes it more than one parameter. A macro may now also receive fewer parameters than specified in its definition.

BUGFIX:
stringize ("#") didn't work for non-ANSI files and an empty (variadic) parameter. The appropriate code was simply missing.

#REDEF :
allows for temporarily re-defining (and thus overriding) an already existing #define (which was forbidden before). This new define is valid only in the current scope, and the previous definition is automatically restored when the scope ends. In global scope you may use #UNDEF to restore the previous #define. This allows for otherwise forbidden key words to be used in macros and elsewhere.

"###" :
new preprocessor operator, which returns the number of arguments (count of commas + 1) in a variadic macro parameter.
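
The "count of commas + 1" rule can be illustrated outside the preprocessor. The following is only a minimal sketch in C, not the compiler's code: the function name `count_args` is hypothetical, and it assumes that every comma separates two arguments (commas nested in parentheses or string literals are not treated specially here).

```c
#include <assert.h>
#include <stddef.h>

/* Count the arguments in a raw argument text as "commas + 1".
   Assumption: every comma is a separator; nesting is ignored. */
static size_t count_args(const char *text)
{
    assert(text != NULL);
    size_t count = 1;              /* an empty parameter still counts as one */
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == ',')
            ++count;
    }
    return count;
}
```

With this rule an empty text yields 1 and "a, b, c" yields 3, which matches the statement that an empty parameter counts as one parameter.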

"#&#" :
stringize uppercase, that is a macro parameter is returned as uppercase string - same as "#", but uppercase.

JK


jklwn · 2019-09-04T21:58:17Z

Section two is ready. This adds all new RTL function definitions needed for the new array features

It might still be necessary to apply minor changes to these definitions in rtl-array.bas later on.

JK


jklwn · 2019-09-05T19:26:09Z

As a third step "array.bi" has been added. This file contains the necessary definitions and code. It will be updated along with the following commits for the RTL in order to activate the provided features.

JK


jklwn · 2019-09-07T12:43:14Z

The last section covers additions to the RTL and corresponding changes to the compiler and "array.bi"

All new features work with all kinds of arrays regardless of the data type. Passed z/wstring arrays don´t work because of an existing bug. Applying #158 to this PR fixes this problem.

Arrays:
You could think of an array as a list of consecutive elements of the same kind in memory. An element could be a standard variable type or a structure (user defined type). An element is accessed by its index. The first element gets index 1, the second element gets index 2 and so on. Actually you may define an array to begin at a different index than 1; the compiler is clever enough to do the necessary math for you.

This seems pretty obvious so far, but it becomes a bit more complicated with multidimensional arrays. That is, you must specify more than one index value for accessing an element. The memory layout is the same, but the overlaying logic is different. Nevertheless any multidimensional array may be seen as a one dimensional array too.

In fact "DIM a(1 to 8) AS BYTE" allocates the same amount of memory as "DIM b(1 to 2, 1 to 2, 1 to 2) AS BYTE". In case of a() the first element in memory is a(1), the second a(2) and so on, in case of b(), the first element is b(1,1,1), the second element in memory is b(1,1,2), the third is b(1,2,1).

If we look at b() as if it was a one dimensional array with a starting index of 1, we would have a "linear index" of 1 for (1,1,1), of 2 for (1,1,2), 3 for (1,2,1) and so on.

All array features work for multidimensional arrays too. You may specify a certain element by its "regular index" (x,y,z,...) or by its "linear index". It is important to understand that array scan always returns the linear index, which is identical to the index into a one dimensional array with a starting index of 1.

Example:
for "DIM a(1 to 8) AS BYTE" a linear index of 4 denotes element 4
for "DIM a(10 to 18) AS BYTE" a linear index of 4 denotes element 13 (!)
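
The mapping from a regular index to the one based linear index can be sketched in C as follows. This is an illustration under assumptions, not the RTL implementation: `dim_bounds` and `linear_index` are hypothetical names, and the only fact taken from the description above is that the last dimension varies fastest in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Bounds of one dimension, as in "DIM b(lo TO hi)". */
typedef struct { long lo, hi; } dim_bounds;

/* One based linear index of the element at idx[0..ndims-1].
   The last dimension varies fastest, as in the b(1,1,2) example. */
static size_t linear_index(const dim_bounds *dims, size_t ndims, const long *idx)
{
    size_t pos = 0;
    for (size_t d = 0; d < ndims; ++d) {
        assert(idx[d] >= dims[d].lo && idx[d] <= dims[d].hi);
        size_t extent = (size_t)(dims[d].hi - dims[d].lo + 1);
        pos = pos * extent + (size_t)(idx[d] - dims[d].lo);
    }
    return pos + 1;                /* shift from zero based to one based */
}
```

For bounds (1 to 2, 1 to 2, 1 to 2) the index (1,2,1) maps to 3, and for bounds (10 to 18) the index 13 maps to 4, in line with the examples above.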

Syntax:

array(sort, ... no return value

Sorts an array of arbitrary type. For arrays of UDTs you must define and use a callback function for sorting. For all standard variable types there is a ready-made sorting function in the RTL, or you may use a user defined callback if you like. In general using a callback function is slower because of the overhead of multiple calls.

The default sorting direction is ascending ("up"), if not specified otherwise. For sorting in descending order you must specify "down".

By default all elements of an array are sorted. You may specify a starting index and a count of elements to sort. If no count is given, all elements up to the end of the array are sorted.

The starting index may be given as a regular index (index value(s) in parentheses, as usual: "(x,y,...)") or as a linear index. In case of a linear index, the key word "pos" is required: "pos(i)".

You may specify a second array of arbitrary type to be sorted in the same order as the first array. That is, both arrays can be of different types. In this case both arrays must be enclosed in parentheses (see example #4).

Arrays of strings (zstring/wstring/string and ustring) may be sorted case insensitive. Default is case sensitive. For case insensitive sorting you must specify the "nocase" key word in parentheses together with the string array to sort (see examples #3 and #4).

array(sort, array)
array(sort, array, down [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
array(sort, (array, nocase))
array(sort, (array1, array2, nocase) [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
array(sort, (array, nocase), down [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
array(sort, array, @customsortproc [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
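
What "a second array sorted in the same order as the first" means can be shown with a small C sketch. It is not the algorithm used in array_sort.c: `sort_pair` is a hypothetical name, and a plain insertion sort is used only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Sort keys[] ascending and apply the same reordering to tags[].
   Insertion sort keeps the sketch short; it is not the RTL algorithm. */
static void sort_pair(int *keys, const char **tags, size_t count)
{
    assert(keys != NULL && tags != NULL);
    for (size_t i = 1; i < count; ++i) {
        int key = keys[i];
        const char *tag = tags[i];
        size_t j = i;
        while (j > 0 && keys[j - 1] > key) {
            keys[j] = keys[j - 1];     /* move the key up ... */
            tags[j] = tags[j - 1];     /* ... and its companion with it */
            --j;
        }
        keys[j] = key;
        tags[j] = tag;
    }
}
```

Only the first array decides the order; the second array, which may be of a different type, simply follows every move.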

array(insert, ... no return value

Inserts an element into an existing array by shifting the already present elements up by one. Currently the last element is shifted out and thus deleted. For elements with a destructor, the destructor is called. You can prevent this loss by re-dimming the array to an appropriate size before inserting new elements.

"Value" must be of the same variable type as the array itself and is assigned to the given position. You may only insert one element at a time.

Position may be a regular index or a linear index. By default all elements up to the end of the array are shifted. Count restricts this shift operation to the specified number of elements. The last element covered by "count" is deleted in this case.

array(insert, array, value [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
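
The shift performed by an insert can be sketched in C for a plain integer array. This is an assumption-laden illustration, not the code in array_shift.c: `insert_shift` is a hypothetical name, `at` is a zero based offset, and `count` is the number of elements taking part in the shift.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Insert value at zero based offset `at` without changing the element count.
   Elements at..count-2 move up by one; the former last element is dropped. */
static void insert_shift(int *arr, size_t count, size_t at, int value)
{
    assert(arr != NULL && at < count);
    memmove(&arr[at + 1], &arr[at], (count - 1 - at) * sizeof arr[0]);
    arr[at] = value;
}
```

Inserting 9 at offset 1 into {1, 2, 3, 4} gives {1, 9, 2, 3}; the 4 is shifted out, which is why re-dimming beforehand is suggested above.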

array(delete, ... no return value

Deletes an element from an existing array by shifting the following elements down by one. The last element is initialized to the data type's default value. For elements with a constructor, the default constructor is called. You may prevent a last orphaned element by re-dimming the array to an appropriate size after deleting elements.

You may only delete one element at a time.

Position may be a regular index or a linear index. By default all elements up to the end of the array are shifted. Count restricts this shift operation to the specified number of elements. The last element covered by "count" is initialized to its default in this case.

array(delete, array [, (i1 [, i2 [, ...]]) | pos(i), [, count]])
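
The opposite shift for a delete looks like this in a C sketch. Again this is only an illustration: `delete_shift` is a hypothetical name, `at` is a zero based offset, and zero stands in for "the data type's default value" of a numeric element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Delete the element at zero based offset `at` without changing the count.
   Elements at+1..count-1 move down by one; the last slot gets the default. */
static void delete_shift(int *arr, size_t count, size_t at)
{
    assert(arr != NULL && at < count);
    memmove(&arr[at], &arr[at + 1], (count - 1 - at) * sizeof arr[0]);
    arr[count - 1] = 0;            /* default value of a numeric element */
}
```

Deleting offset 1 from {1, 2, 3, 4} gives {1, 3, 4, 0}; the trailing 0 is the "orphaned" element that a later REDIM can remove.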

array(scan, ... returns linear index

Searches an array for a specified value, given in a "for()" clause. "Searchterm" must be of a compatible type. That is, you may mix all numeric types (except floating point types) and all string types. Obviously not all allowed combinations make sense under all circumstances. It's up to the user to interpret results correctly. A value of zero is returned if searchterm is not present in the array, otherwise the linear position (!) of the first occurrence is returned.

Position may be a regular index or a linear index. By default all elements up to the end of the array are searched. Count restricts this search operation to the specified number of elements.

Arrays of strings (zstring/wstring/string and ustring) may be searched case insensitive. The default is case sensitive. For a case insensitive search you must specify the "nocase" key word in the for() clause (see example #2).

For arrays of UDTs you must define and use a callback function for searching. For all standard variable types there is a ready-made search function in the RTL. You may use a user defined callback for special searches if you like. In general using a callback function is slower because of the overhead of multiple calls.

A return value of 0 may indicate either "not found" or "an error occurred", you must check ERR in this case.

i = array(scan, array, for()[, (i1 [, i2 [, ...]]) | pos(i), [, count]])

for(searchterm [,nocase [,from]])
for(searchterm [,from])
for(@customproc)
'***********************************************************************************************
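
The return convention of a scan (one based linear position, zero for "not found") can be sketched in C for a string array. This is not the code in array_scan.c: `str_equal` and `scan_strings` are hypothetical names, and the case insensitive comparison only covers what `tolower` handles in the current C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Compare two strings, optionally ignoring case. Returns 1 if equal. */
static int str_equal(const char *a, const char *b, int nocase)
{
    for (;; ++a, ++b) {
        unsigned char ca = (unsigned char)*a;
        unsigned char cb = (unsigned char)*b;
        if (nocase) {
            ca = (unsigned char)tolower(ca);
            cb = (unsigned char)tolower(cb);
        }
        if (ca != cb) return 0;
        if (ca == '\0') return 1;
    }
}

/* One based linear position of the first match, or 0 if term is absent. */
static size_t scan_strings(const char *const *arr, size_t count,
                           const char *term, int nocase)
{
    assert(arr != NULL && term != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (str_equal(arr[i], term, nocase))
            return i + 1;
    }
    return 0;
}
```

Because valid positions start at 1, the value 0 is free to signal "not found" regardless of the array's lbound.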

array(specifier, ...

Returns the requested value from the array's descriptor. Some values are already available, either directly or by doing some math. This offers a consistent interface for all values, which might be useful.

i = array(specifier, array), returns any ptr (1), integer (2,3,4,5) or boolean (6,7,8)
'***********************************************************************************************
data 'pointer to the first array element in memory
dimensions '# of dimensions (same as UBOUND(array, 0))
total_size 'in bytes
total_count 'total # of elements in all dimensions
size 'same as sizeof(array)
is_fixed_len 'fixed len array
is_fixed_dim 'fixed dimension array
is_dynamic 'dynamic array
is_attached 'attached or not
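
How values such as total_count and total_size follow from the bounds stored in a descriptor can be sketched in C. The struct below is a hypothetical layout made up for this illustration; it is not the descriptor defined in fb_array.h, and the function names are assumptions too.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long lo, hi; } dim_bounds;

/* Hypothetical descriptor layout, for illustration only. */
typedef struct {
    void      *data;           /* pointer to the first element */
    size_t     element_len;    /* size of one element in bytes */
    size_t     dimensions;     /* number of dimensions, 0 = not dimmed */
    dim_bounds dims[8];
} array_desc;

/* Total number of elements over all dimensions. */
static size_t desc_total_count(const array_desc *d)
{
    assert(d != NULL && d->dimensions <= 8);
    if (d->dimensions == 0)
        return 0;
    size_t n = 1;
    for (size_t i = 0; i < d->dimensions; ++i)
        n *= (size_t)(d->dims[i].hi - d->dims[i].lo + 1);
    return n;
}

/* Total size in bytes. */
static size_t desc_total_size(const array_desc *d)
{
    return desc_total_count(d) * d->element_len;
}
```

For a BYTE array dimmed (1 to 2, 1 to 2, 1 to 2) both functions return 8, the same amount of memory as "DIM a(1 to 8) AS BYTE".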

array(pos | ptr | index, ...

Calculates the linear index from a regular index and vice versa, and does the same for a given memory pointer.

This is necessary for retrieving the regular index from the linear index returned by "array(scan, ...)", and it is necessary when dealing with multidimensional arrays.

i = array(pos, array, (i1 [, i2 [,...]])) returns linear (one based) position from index
p = array(ptr, array, (i1 [, i2 [,...]])) returns memory ptr for this index
u = array(index, array, pos) returns array_index from linear index
u = array(index, array, ptr) returns array_index
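
The conversion from a linear index back to regular indices can be sketched in C. This is an illustration, not the code in array_calc.c: `dim_bounds` and `indices_from_linear` are hypothetical names, and the sketch assumes that the last dimension varies fastest.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { long lo, hi; } dim_bounds;

/* Split a one based linear index into one index per dimension.
   The last dimension varies fastest, so it is peeled off first. */
static void indices_from_linear(const dim_bounds *dims, size_t ndims,
                                size_t pos, long *idx_out)
{
    assert(dims != NULL && idx_out != NULL && pos >= 1);
    size_t rest = pos - 1;                     /* zero based offset */
    for (size_t d = ndims; d-- > 0; ) {
        size_t extent = (size_t)(dims[d].hi - dims[d].lo + 1);
        idx_out[d] = dims[d].lo + (long)(rest % extent);
        rest /= extent;
    }
    assert(rest == 0);                         /* pos must lie inside the array */
}
```

For bounds (1 to 2, 1 to 2, 1 to 2) the linear index 3 yields (1,2,1); for bounds (10 to 18) the linear index 4 yields 13.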

array(attach, ...
array(reset, ...

Redims an array at a specific memory location. You must specify the index (or indices) as usual with "REDIM()". No memory is allocated or deallocated; instead existing memory is made accessible through an overlay structure in the form of an array. This can be useful for all kinds of direct data manipulation.

Array reset erases the array descriptor again without erasing the memory.

This feature requires an empty (not yet dimmed or erased) dynamic array and doesn't work with fixed size arrays.

array(attach, array, redim(1 to 5[, 1 to 6 [, ...]]), memory ptr)
array(reset, array)
'***********************************************************************************************
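
The overlay idea behind attach and reset can be sketched in C with a small view structure. Everything here is hypothetical (`int_view`, `view_attach`, `view_at`, `view_reset`); it only illustrates that attaching stores a pointer and bounds, and that resetting clears them without touching the memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one dimensional overlay over memory owned elsewhere. */
typedef struct {
    int  *data;        /* NULL while the view is empty */
    long  lo, hi;      /* index bounds of the overlay */
    int   attached;    /* 1 while the view overlays foreign memory */
} int_view;

/* Overlay existing memory; nothing is allocated or copied. */
static void view_attach(int_view *v, int *memory, long lo, long hi)
{
    assert(v != NULL && v->data == NULL);      /* requires an empty view */
    assert(memory != NULL && lo <= hi);
    v->data = memory;
    v->lo = lo;
    v->hi = hi;
    v->attached = 1;
}

/* Address of the element at `index`, using the overlay's own bounds. */
static int *view_at(const int_view *v, long index)
{
    assert(v != NULL && v->data != NULL);
    assert(index >= v->lo && index <= v->hi);
    return &v->data[index - v->lo];
}

/* Clear the view again; the underlying memory is left untouched. */
static void view_reset(int_view *v)
{
    assert(v != NULL);
    v->data = NULL;
    v->lo = 0;
    v->hi = -1;
    v->attached = 0;
}
```

Writing through the view changes the original buffer, and after a reset the buffer still holds its contents while the view is empty again.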

jklwn · 2019-09-07T13:13:24Z

I didn´t know that Travis fails on warnings ...

This PR is ready now for review. I coded and ran tests, which will have to be ported to the test suite as soon as this PR is in a state, that it will be accepted.

JK

Mr-Swiss · 2019-09-10T12:56:16Z

There is but one thing that bothers me currently, with respect to INDEXING.
Currently by default in FB (as in C), all indexing (not just arrays) starts with 0 (NULL/zero).
(Except if explicitly stated that another LBound() is to be used.)
Therefore, your referencing "starts with 1" is not really logical (and therefore confusing).
Would you please elaborate on the issue?

jklwn · 2019-09-11T13:24:14Z

Well, I have three reasons for starting at 1 for the "linear index":

1.) It's important to understand that a "linear index" is a different thing than a regular index. Both do the same thing but in different ways. Given the actual memory layout of FB arrays, any array (even a multidimensional one) can be seen as a one dimensional array with an lbound of 1. That is, if you have a one dimensional array with an lbound of 1, the linear index corresponds to the regular index (1 to ...). In all other cases (more than one dimension, different lbound) the two differ. If the linear index defaulted to zero like lbound, it would mask the difference, and I fear most users wouldn't realize that there is a difference at all.

2.) With array(scan, ...) I need a return value for "not found". Zero seems to be the most logical value. When counting elements of a list, an array, an enumeration, etc., the first element is usually referred to as "first" (the first character in the alphabet, the first man on the moon, etc.). INSTR() and MID() index characters the same way: the first character gets index 1. The reason why lbound defaults to 0 is that the math is simpler and faster with an index of zero for the first element.

3.) lbound and ubound are defined as "INTEGER", which means an array's index can cover the whole range of an integer variable. So my result must cover the range of an integer type too. But then which value should consistently represent "not found"? &HFFFFFFFF is not an option, because this would break 32/64 bit compatibility; having &HFFFFFFFF for "not found" in 32 bit and &HFFFFFFFFFFFFFFFF in 64 bit is confusing and ugly. Making the result a "UINTEGER" solves this problem: I can have zero for "not found" in 32 and 64 bit and the rest of the range (of integer size) for valid (one based) results.

So in case your array is not one dimensional or its lbound is not 1, you must use
<array_index> = array(index, array, pos) for conversion.

"Array_index" is defined in array.bi as

type array_index    'return type for array(index, ...)
    n as integer    '# of valid index entries, 0 -> no indices given

    li as integer   'linear index (one based), 0 = invalid
    i1 as integer   'index 1
    i2 as integer   '...
    i3 as integer
    i4 as integer
    i5 as integer
    i6 as integer
    i7 as integer
    i8 as integer
end type

JK


jklwn · 2019-12-04T14:27:19Z

I changed the allowed syntax to accept array parameters with and without brackets. The preferred syntax should be "array()" (with brackets) for clarity.

jklwn added 6 commits September 4, 2019 16:08

Make TYPEOF work at run time too, allow parenthesis syntax (for arrays, e.g. "typeof(a())")
9a440ca

BUGFIX: stringize in hLoadMacroW didn't return an empty string for an empty variadic parameter, the else clause was missing
0f6b376

Add "###" preprocessor operator, returns the number of commas + 1. That is, it returns the argument count of a (variadic) parameter. An empty parameter counts as one parameter. This can be extremely useful for processing variadic macro parameters.
7cf05cf

Add "#&#" preprocessor operator, returns an argument stringized (same as "#" operator) but as uppercase. This is useful for case independent parsing of macro arguments.
bef2a81

Add new RTL function definitions to the compiler (fb_ArrayDesc, fb_ArrayCalcPos, fb_ArrayCalcIdxPos, fb_ArrayCalcIdxPtr, fb_ArrayShift, fb_ArraySort, fb_ArrayAttach, fb_ArrayReset, fb_ArrayScan)
e4dee2a

Add "array.bi" as FreeBASIC front end. This file contains the necessary definitions and code.
3288585

jklwn added 9 commits September 5, 2019 21:54

Add definitions to fb_array.h. Minor changes might still be necessary.
3d0400b

Add "array_desc.c", return various information from an array's descriptor
db99ecd

Add new compiler intrinsic "__PREVLINE__" (__line__ - 1). Reporting compile time errors (#ERROR ...) inside macros is one line off sometimes, #LINE __PREVLINE__ can fix this.
829acab

Add "array_attach.c", allows for creating a dynamic array as a memory overlay and for resetting such an array to default state again.
d087ddd

Add "array_calc.c", do calculations for conversion between memory pointer, linear (one based) index and (multidimensional) array indices
d74547e

Add "array_shift.c", allows for inserting/deleting one element into/from an existing array. This doesn't change (REDIM) the element count of an array
a00b0b6

Add "ustring.bi", a dynamic wide string type, need this for array(sort, ...) and array(scan, ...) to be complete
2a7c69c

Add "array_sort.c", provides dedicated and speed optimized sort functions for each standard variable type, as well as support for custom sort functions
140328f

Add "array_scan.c", search an array for a numeric or string variable, literal or expression, custom search functions are supported too.
78c3efc

Removed warnings "signed - unsigned" when compiling the RTL
ab9017e

jklwn added 3 commits December 3, 2019 22:24

add macro without brackets feature
3426d24

adapt pp operators to PR#191 (change ### to #% and #&# to ###), add new pp operator (#&# and #?)
6fb3258

add new pp operator "#\#" split parameter into two parameters at first open bracket, adapt array.bi. this allows for arrays to be specified with and without brackets (a - a() both are valid)
c8869a1

replace fb_hStrCopy with "SHIFT" alias fb_MemMove
5109eb7

jklwn closed this Jul 11, 2020
