Improve non-trivial datatypes (#1808)
Move the storage of rank and order from `ast.basic.TypedAstNode` to an
internal property of `ast.datatypes.PyccelType` objects (the properties
are still exposed via `ast.basic.TypedAstNode`). This allows mixed
ordered objects to be described and therefore created (e.g.
`list[float[:,:](order=F)]`). A test is added for this case. This change
should make #1583 simple to implement.

Making the rank and order part of the type fixes #1821. This provides
enough information to allow lists of arrays. Fixes #1721.
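
As a rough illustration of why storing rank and order on the type makes `list[float[:,:](order=F)]` describable, the sketch below uses two hypothetical stand-in classes (`ScalarType` and `ContainerType`, which are not Pyccel classes). It assumes that each container level records how many indices it consumes and that the total rank is the sum over the nested levels, as described in the developer docs updated by this commit.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ScalarType:
    """Hypothetical stand-in for a fixed-size type such as float."""
    name: str
    rank: int = 0
    order: Optional[str] = None


@dataclass(frozen=True)
class ContainerType:
    """Hypothetical stand-in for a homogeneous container type."""
    name: str
    element_type: Union["ContainerType", ScalarType]
    container_rank: int = 1
    order: Optional[str] = None

    @property
    def rank(self) -> int:
        # Indices needed to reach a scalar: this level plus all nested levels.
        return self.container_rank + self.element_type.rank


float_t = ScalarType("float")
# float[:,:](order=F): one container level that consumes two indices.
f_array = ContainerType("ndarray", float_t, container_rank=2, order="F")
# list[float[:,:](order=F)]: a list whose elements are F-ordered 2D arrays.
list_of_arrays = ContainerType("list", f_array)

print(list_of_arrays.rank)                # 3
print(list_of_arrays.element_type.order)  # F
```

Because the order lives on each level of the type rather than on the AST node, the inner array can be F-ordered independently of the container that holds it.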

In order to correctly index types such as `list[float[:,:](order=F)]`
the creation of an `IndexedElement` in the syntactic stage is modified.
Instead of collapsing all the indices into one `IndexedElement` an
`IndexedElement` is created for each `ast.Subscript`. In the semantic
stage this is collapsed into one `IndexedElement` for each container.

E.g. for the following code:
```python
import numpy as np
a = (np.ones((3,4)), np.zeros((3,4)))
a[0][1][2]
```
On the devel branch after the syntactic stage the last line is described
as `IndexedElement('a', (0, 1, 2))`. This persists to the codegen stage.
On this branch after the syntactic stage the last line is described as
`IndexedElement(IndexedElement(IndexedElement('a', (0,)), (1,)), (2,))`.
After the semantic stage it becomes `IndexedElement(IndexedElement('a',
(0,)), (1,2))`. The printers are modified to handle this. The generated
code is not changed (as support for lists has not yet been added) so the
indices are still used together to index a 3D array.
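
The regrouping described above can be sketched as follows. This is a minimal illustration and not the code in Pyccel's semantic stage: `Indexed`, `flatten` and `collapse` are hypothetical names, only integer indices are handled (no slices), and the number of indices consumed by each container level is passed in explicitly rather than read from a type.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Indexed:
    """Hypothetical stand-in for IndexedElement: a base plus a tuple of indices."""
    base: Union["Indexed", str]
    indices: Tuple[int, ...]


def flatten(node):
    """Return the innermost base name and all indices in source order."""
    indices = ()
    while isinstance(node, Indexed):
        indices = node.indices + indices
        node = node.base
    return node, indices


def collapse(node, container_ranks):
    """Regroup indices so that there is one Indexed node per container level.

    container_ranks lists, from the outermost container inwards, how many
    indices each level consumes (e.g. (1, 2) for a tuple of 2D arrays).
    """
    result, indices = flatten(node)
    start = 0
    for n in container_ranks:
        chunk = indices[start:start + n]
        if not chunk:
            break
        result = Indexed(result, chunk)
        start += n
    if start < len(indices):
        raise IndexError("too many indices for the described containers")
    return result


# a[0][1][2] with one node per subscript, as after the syntactic stage
syntactic = Indexed(Indexed(Indexed("a", (0,)), (1,)), (2,))
# 'a' is a tuple (one index) of 2D arrays (two indices)
semantic = collapse(syntactic, container_ranks=(1, 2))
print(semantic)
```

With `container_ranks=(1, 2)` the three single-index nodes become two nodes, mirroring the `IndexedElement(IndexedElement('a', (0,)), (1,2))` form quoted above.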

**Commit Summary**

- Update documentation
- Add missing `if __name__ == '__main__'` on old documentation examples.
- Remove `static_rank` and `static_order` properties from
`pyccel.ast.basic`. These properties are now contained in `static_type`.
- Remove `_rank` and `_order` properties from `TypedAstNode`s.
- Change `PyccelOperator.set_shape_rank` functions to
`PyccelOperator.set_shape` functions (the rank must now be determined at
the same time as the type).
- Remove `PyccelOperator._set_order` functions.
- Simplify homogeneity checks for tuples, lists and sets
- Ensure the shape of an empty list/tuple literal is set correctly.
- Correct the class type of `CmathPolar`
- Remove the `order` argument from the `Allocate` constructor (the
information is now retrieved from the order of the allocated variable).
- Add the properties `rank` and `order` to
`pyccel.ast.datatypes.PyccelType`.
- Add a `switch_rank` function to
`pyccel.ast.datatypes.HomogeneousContainerType`.
- Add a `container_rank` property to
`pyccel.ast.datatypes.HomogeneousContainerType`.
- Parameterise `NumpyNDArrayType` by the rank and the order
- Change the argument of `NumpyNewArray` from `dtype` to `class_type`
(to set the rank/order in the subclass).
- Allow `NumpyArray` to handle mixed rank objects (e.g. list of
F-ordered arrays).
- Rename `NumpyUfuncBase._set_shape_rank` to
`NumpyUfuncBase._get_shape_rank` and return the shape and rank to avoid
saving the rank unnecessarily.
- Rename `NumpyUfuncBase._set_order` to `NumpyUfuncBase._get_order` and
pass the rank and return the order to avoid saving unnecessarily.
- Add docstrings.
- Add a `NumpyNDArrayType.__new__` function which redirects rank 0
arrays to scalar types.
- Augment `NumpyNDArrayType.__add__` to handle rank and ordering.
- Add a `NumpyNDArrayType.swap_order` function to change between C and F
ordered equivalent types.
- Add `__str__` or `__repr__` functions to `PyccelType`s such that
printing them gives a valid type annotation.
- Add a `NumpyInt` object to easily find the NumPy integer which has the
same precision as Python integers.
- Remove `rank` and `order` arguments from
`pyccel.ast.type_annotations.VariableTypeAnnotation` constructor
(information now available in `class_type`).
- Remove `order` property from
`pyccel.ast.type_annotations.VariableTypeAnnotation` (`rank` is retained
for now).
- Update vector expression unravelling functions to handle multiple
levels of `IndexedElement`s.
- Improve docstrings in `pyccel.ast.utilities`.
- Remove `rank` and `order` arguments from
`pyccel.ast.variable.Variable` constructor (the information is now
retrieved from the `class_type`).
- Remove unused property `is_stack_scalar`.
- Simplify `is_ndarray` method.
- Add `_is_slice` attribute to `IndexedElement` to indicate if an
element or a slice of the base is represented.
- Add `allows_negative_indexes` property to `IndexedElement`.
- Update `_print_IndexedElement` to handle multi-level
`IndexedElement`s.
- Correct `abs` call in `_print_NumpyNorm`
- Improve error message for wrong arguments.
- Allocate strings on the stack to avoid calling `Allocate` (to be
improved with #459).
- Remove `get_type_description` (this is now handled by
`PyccelType.__str__`).
- Provide a traceback to `errors.report` so that a `TypeError` raised
during a `FunctionCall` can be located more easily.
- Stop collapsing `ast.Subscript` into one `IndexedElement` object.
- Add some mixed ordered tests.
- Disable tests with ambiguous interfaces (see #1821).
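
The `switch_rank` and `swap_order` entries in the summary can be illustrated with a small sketch. The classes `Scalar` and `Container` are hypothetical stand-ins, not Pyccel's `HomogeneousContainerType` or `NumpyNDArrayType`. The sketch assumes that reducing the rank by at least the `container_rank` of a level recurses into the element type, and that an order is dropped once only one dimension remains; whether Pyccel keeps the order in other partial reductions is not stated in the commit, so keeping it here is a choice of the sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class Scalar:
    """Hypothetical scalar type (rank 0)."""
    name: str
    rank: int = 0


@dataclass(frozen=True)
class Container:
    """Hypothetical homogeneous container type."""
    name: str
    element_type: Union["Container", Scalar]
    container_rank: int = 1
    order: Optional[str] = None

    @property
    def rank(self):
        return self.container_rank + self.element_type.rank

    def switch_rank(self, new_rank):
        """Return a similar type whose total rank is new_rank (reduction only)."""
        if new_rank < 0:
            raise ValueError("rank cannot be negative")
        drop = self.rank - new_rank
        if drop < 0:
            raise NotImplementedError("increasing the rank is not sketched here")
        if drop == 0:
            return self
        if drop >= self.container_rank:
            # This whole level is indexed away; continue in the element type.
            inner = self.element_type
            if isinstance(inner, Scalar):
                return inner
            return inner.switch_rank(new_rank)
        remaining = self.container_rank - drop
        # An order is only meaningful when more than one dimension remains.
        return replace(self, container_rank=remaining,
                       order=self.order if remaining > 1 else None)

    def swap_order(self):
        """Swap 'C' and 'F'; a type without an order is returned unchanged."""
        swapped = {"C": "F", "F": "C"}.get(self.order)
        return replace(self, order=swapped)


arr = Container("ndarray", Scalar("float"), container_rank=2, order="F")
lst = Container("list", arr)

print(lst.switch_rank(2) == arr)          # True: the type of lst[i]
print(lst.switch_rank(1).container_rank)  # 1: the type of lst[i][j]
print(arr.swap_order().order)             # C
```

Returning new frozen instances rather than mutating in place keeps the types safe to share between AST nodes, which is one reason a "switch" style API can be convenient here.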
EmilyBourne committed Apr 16, 2024
1 parent 1390622 commit e36726f
Showing 42 changed files with 1,176 additions and 875 deletions.
2 changes: 2 additions & 0 deletions .dict_custom.txt
@@ -113,3 +113,5 @@ setter
bitwise
datatyping
datatypes
indexable
traceback
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -14,6 +14,8 @@ All notable changes to this project will be documented in this file.
- #1750 : Add Python support for set method `remove()`.
- #1787 : Ensure `STC` is installed with Pyccel.
- #1743 : Add Python support for set method `discard()`.
- \[INTERNALS\] Added `container_rank` property to `ast.datatypes.PyccelType` objects.
- \[DEVELOPER\] Added an improved traceback to the developer-mode errors for errors in function calls.

### Fixed

@@ -29,6 +31,7 @@ All notable changes to this project will be documented in this file.
- #1785 : Add missing cast when creating an array of booleans from non-boolean values.
- #1218 : Fix bug when assigning an array to a slice in Fortran.
- #1830 : Fix missing allocation when returning an annotated array expression.
- #1821 : Ensure an error is raised when creating an ambiguous interface.

### Changed
- #1720 : functions with the `@inline` decorator are no longer exposed to Python in the shared library.
@@ -39,11 +42,15 @@ All notable changes to this project will be documented in this file.
- \[INTERNALS\] Build `utilities.metaclasses.ArgumentSingleton` on the fly to ensure correct docstrings.
- \[INTERNALS\] Rewrite datatyping system. See #1722.
- \[INTERNALS\] Moved precision from `ast.basic.TypedAstNode` to an internal property of `ast.datatypes.FixedSizeNumericType` objects.
- \[INTERNALS\] Moved rank from `ast.basic.TypedAstNode` to an internal property of `ast.datatypes.PyccelType` objects.
- \[INTERNALS\] Moved order from `ast.basic.TypedAstNode` to an internal property of `ast.datatypes.PyccelType` objects.
- \[INTERNALS\] Use cached `__add__` method to determine result type of arithmetic operations.
- \[INTERNALS\] Use cached `__and__` method to determine result type of bitwise comparison operations.
- \[INTERNALS\] Removed unused `fcode`, `ccode`, `cwrappercode`, `luacode`, and `pycode` functions from printers.
- \[INTERNALS\] Removed unused arguments from methods in `pyccel.codegen.codegen.Codegen`.
- \[INTERNALS\] Stop storing `FunctionDef`, `ClassDef`, and `Import` objects inside `CodeBlock`s.
- \[INTERNALS\] Remove the `order` argument from the `pyccel.ast.core.Allocate` constructor.
- \[INTERNALS\] Remove `rank` and `order` arguments from `pyccel.ast.variable.Variable` constructor.

### Deprecated

8 changes: 0 additions & 8 deletions developer_docs/ast_nodes.md
@@ -54,14 +54,6 @@ The order indicates how an array is laid out in memory. This can either be row-m

The static type is the class type that would be assigned to an object created using an instance of this class as a type annotation.

### Static rank

The static rank is the rank that would be assigned to an object created using an instance of this class as a type annotation.

### Static order

The static order is the order that would be assigned to an object created using an instance of this class as a type annotation.

## Pyccel Internal Function

The class `pyccel.ast.internals.PyccelInternalFunction` is a super class. This class should never be used directly but provides functionalities which are common to certain AST objects. These AST nodes are those which describe functions which are supported by Pyccel. For example it is used for functions from the `math` library, the `cmath` library, the `numpy` library, etc. `PyccelInternalFunction` inherits from `TypedAstNode`. The type information for the sub-class describes the type of the result of the function.
6 changes: 5 additions & 1 deletion developer_docs/type_inference.md
@@ -68,7 +68,7 @@ The and operator describes what happens when two numeric types are combined in a
When using these operators on an unknown number of arguments it can be useful to use `NativeGeneric()` as a starting point for the sum.

#### Container Type
A `ContainerType` is an object which is comprised of `FixedSizeType` objects (e.g. `ndarray`,`list`,`tuple`, custom class). The sub-class `HomogeneousContainerType` describes containers which contain homogeneous data. These objects are characterised by an `element_type`. The elements of a `HomogeneousContainerType` are instances of `PyccelType`, but they can be either `FixedSizeType`s or `ContainerType`s.
A `ContainerType` is an object which is comprised of `FixedSizeType` objects (e.g. `ndarray`,`list`,`tuple`, custom class). The sub-class `HomogeneousContainerType` describes containers which contain homogeneous data. These objects are characterised by an `element_type`, a `rank` (and associated `container_rank`) and an `order`. The elements of a `HomogeneousContainerType` are instances of `PyccelType`, but they can be either `FixedSizeType`s or `ContainerType`s. The `container_rank` is an integer equal to the number of indices necessary to index the container and get an element of type `element_type`. As these elements may also be indexable the `rank` property allows us to get the number of indices necessary to obtain a scalar element. It is the sum of all the `container_rank`s in the nested types. The `order` specifies the order in which the indices should be used to index the object. This is discussed in detail in the [order docs](./order_docs.md).

`HomogeneousContainerType`s also contain some utility functions. They implement `primitive_type` and `precision` to get the properties of the internal `FixedSizeType` (even if that type is inside another `HomogeneousContainerType`). They also implement `switch_basic_type` which creates a new `HomogeneousContainerType` which is similar to the current `HomogeneousContainerType`. The only difference is that the `FixedSizeType` is exchanged. This is useful when we want to preserve information about the container but need to change the type. For example, when we divide an integer by another we get a floating point type. When we divide a NumPy array or a CuPy array of integers by an integer (or array of integers) we get a NumPy/CuPy array of floating point numbers (with default Python precision). In order to preserve the container type we therefore call `switch_basic_type`. So for the division in the case of NumPy arrays, we want to change the type from `np.ndarray[int]` to `np.ndarray[float]`. This is done in one line:
```python
@@ -89,4 +89,8 @@ for container in new_container_types:

The `switch_basic_type` cannot be implemented generally in `PyccelType` as there is no logical interpretation for an inhomogeneous `ContainerType`, however the function is also implemented (as the identity function) for `FixedSizeType`s so `switch_basic_type` can be used without the need for type checks (generally inhomogeneous containers will not be valid arguments to classes which may need to use the `switch_basic_type` function).

`HomogeneousContainerType`s also contain a `switch_rank` function. This function is similar to `switch_basic_type` in that it is used to obtain a type which is similar in all but one characteristic. It is usually used to reduce the rank of an object, for example when calculating the type of a slice, however in the future it can also be used to increase the size of the type (e.g. to implement `np.newaxis`), in this case an order may need to be provided to add additional context. Increasing the rank is only possible for multi-dimensional types (e.g. `NumpyNDArrayType`) however the rank can be decreased for any `ContainerType`. If the rank is reduced by more than the `container_rank`, this function is called recursively.

For multi-dimensional `HomogeneousContainerType`s (e.g. `NumpyNDArrayType`) the function `swap_order` is also implemented. This inverts the ordering, changing from 'C' to 'F' or 'F' to 'C' if the rank is greater than 1.

In order to access the internal `FixedSizeType`, `PyccelType` also implements a `datatype` property. This makes more sense in the case of a `HomogeneousContainerType` however it is also implemented (as the identity function) for `FixedSizeType`s so the low-level type can be obtained without the need for type checks.
110 changes: 88 additions & 22 deletions docs/ndarrays.md
@@ -22,9 +22,10 @@ Generally a variable in Pyccel should always keep its initial type, this also tr
```Python
import numpy as np

a = np.array([1, 2, 3], dtype=float)
#(some code...)
a = np.array([1, 2, 3], dtype=int)
if __name__ == '__main__':
a = np.array([1, 2, 3], dtype=float)
#(some code...)
a = np.array([1, 2, 3], dtype=int)
```

_OUTPUT_ :
@@ -46,9 +47,10 @@ Pyccel calls its own garbage collector when needed, but has a set of rules to do
```Python
import numpy as np

a = np.ones((10, 20))
#(some code...)
a = np.ones(10)
if __name__ == '__main__':
a = np.ones((10, 20))
#(some code...)
a = np.ones(10)
```

_OUTPUT_ :
@@ -66,9 +68,10 @@ This limitation is due to the fact that the rank of Fortran allocatable objects
```Python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 2, 3])
a = b
if __name__ == '__main__':
a = np.array([1, 2, 3, 4, 5])
b = np.array([1, 2, 3])
a = b
```

_OUTPUT_ :
@@ -139,10 +142,11 @@ This limitation is due to the fact that the rank of Fortran allocatable objects
```Python
import numpy as np

a = np.ones(10)
b = a[:5]
#(some code...)
a = np.zeros(20)
if __name__ == '__main__':
a = np.ones(10)
b = a[:5]
#(some code...)
a = np.zeros(20)
```

_OUTPUT_ :
@@ -157,7 +161,7 @@ This limitation is set since we need to free the previous data when we reallocat

### Slicing and indexing ###

The indexing and slicing in Pyccel handles only the basic indexing of [numpy arrays](https://numpy.org/doc/stable/user/basics.indexing.html).
The indexing and slicing in Pyccel handles only the basic indexing of [numpy arrays](https://numpy.org/doc/stable/user/basics.indexing.html). When multiple indexing expressions are used on the same variable Pyccel squashes them into one object. This means that we do not handle multiple slice indices applied to the same variable (e.g. `a[1::2][2:]`). This is not recommended anyway as it makes code hard to read.

Some examples:

@@ -166,8 +170,9 @@ Some examples:
```Python
import numpy as np

a = np.array([1, 3, 4, 5])
a[0] = 0
if __name__ == '__main__':
a = np.array([1, 3, 4, 5])
a[0] = 0
```

- C equivalent:
@@ -211,8 +216,9 @@ Some examples:
```Python
import numpy as np

a = np.ones((10, 20))
b = a[2:, :5]
if __name__ == '__main__':
a = np.ones((10, 20))
b = a[2:, :5]
```

- C equivalent:
@@ -257,10 +263,11 @@ Some examples:
```Python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
b = a[1]
c = b[2]
print(c)
if __name__ == '__main__':
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
b = a[1]
c = b[2]
print(c)
```

- C equivalent:
@@ -311,6 +318,65 @@ Some examples:
end program prog_ex
```

- Python code:

```Python
import numpy as np

if __name__ == '__main__':
a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
b = a[1::2][2]
print(b)
```

- C equivalent:

```C
#include <stdlib.h>
#include "ndarrays.h"
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <inttypes.h>
int main()
{
t_ndarray a = {.shape = NULL};
int64_t b;
a = array_create(1, (int64_t[]){INT64_C(8)}, nd_int64, false, order_c);
int64_t Dummy_0000[] = {INT64_C(1), INT64_C(2), INT64_C(3), INT64_C(4), INT64_C(5), INT64_C(6), INT64_C(7), INT64_C(8)};
memcpy(&a.nd_int64[INT64_C(0)], Dummy_0000, 8 * a.type_size);
b = GET_ELEMENT(a, nd_int64, INT64_C(5));
printf("%"PRId64"\n", b);
free_array(&a);
return 0;
}
```

- Fortran equivalent:

```Fortran
program prog_prog_tmp_index

use tmp_index

use, intrinsic :: ISO_C_Binding, only : i64 => C_INT64_T
use, intrinsic :: ISO_FORTRAN_ENV, only : stdout => output_unit
implicit none

integer(i64), allocatable :: a(:)
integer(i64) :: b

allocate(a(0:7_i64))
a = [1_i64, 2_i64, 3_i64, 4_i64, 5_i64, 6_i64, 7_i64, 8_i64]
b = a(5_i64)
write(stdout, '(I0)', advance="yes") b
if (allocated(a)) then
deallocate(a)
end if

end program prog_prog_tmp_index
```

## NumPy [ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) functions/properties progress in Pyccel ##

- Supported [types](https://numpy.org/devdocs/user/basics.types.html):
33 changes: 2 additions & 31 deletions pyccel/ast/basic.py
@@ -534,7 +534,7 @@ def rank(self):
Number of dimensions of the object. If the object is a scalar then
this is equal to 0.
"""
return self._rank # pylint: disable=no-member
return self.class_type.rank

@property
def dtype(self):
@@ -557,7 +557,7 @@ def order(self):
('F') format. This is only relevant if rank > 1. When it is not relevant
this function returns None.
"""
return self._order # pylint: disable=no-member
return self.class_type.order

@property
def class_type(self):
@@ -570,33 +570,6 @@ def class_type(self):
"""
return self._class_type # pylint: disable=no-member

@classmethod
def static_rank(cls):
"""
Number of dimensions of the object.
Number of dimensions of the object. If the object is a scalar then
this is equal to 0.
This function is static and will return an AttributeError if the
class does not have a predetermined rank.
"""
return cls._rank # pylint: disable=no-member

@classmethod
def static_order(cls):
"""
The data layout ordering in memory.
Indicates whether the data is stored in row-major ('C') or column-major
('F') format. This is only relevant if rank > 1. When it is not relevant
this function returns None.
This function is static and will return an AttributeError if the
class does not have a predetermined order.
"""
return cls._order # pylint: disable=no-member

@classmethod
def static_type(cls):
"""
@@ -625,8 +598,6 @@ def copy_attributes(self, x):
The node from which the attributes should be copied.
"""
self._shape = x.shape
self._rank = x.rank
self._order = x.order
self._class_type = x.class_type


2 changes: 1 addition & 1 deletion pyccel/ast/bind_c.py
@@ -341,7 +341,7 @@ def __init__(self, var, original_res_var, scope, **kwargs):
name = original_res_var.name
self._shape = [scope.get_temporary_variable(PythonNativeInt(),
name=f'{name}_shape_{i+1}')
for i in range(original_res_var._rank)]
for i in range(original_res_var.rank)]
self._original_res_var = original_res_var
super().__init__(var, **kwargs)

10 changes: 4 additions & 6 deletions pyccel/ast/bitwise_operators.py
@@ -86,8 +86,6 @@ class PyccelBitOperator(PyccelOperator):
The second argument passed to the operator.
"""
_shape = None
_rank = 0
_order = None
__slots__ = ('_class_type',)

def __init__(self, arg1, arg2):
@@ -130,12 +128,12 @@ def _calculate_type(self, arg1, arg2):

return class_type

def _set_shape_rank(self):
def _set_shape(self):
"""
Set the shape and rank of the resulting object.
Set the shape of the resulting object.
Set the shape and rank of the resulting object. For a PyccelBitOperator,
the shape and rank are class attributes so nothing needs to be done.
Set the shape of the resulting object. For a PyccelBitOperator,
the shape is a class attribute so nothing needs to be done.
"""

def __repr__(self):