Use case: counting the average number of consecutive days BTC closes below its SMA50D #27

s-leroux · 2024-04-07T15:59:51Z

Implement the necessary code and documentation to cover the following use case:

We want to know the average number of consecutive days BTC closes below its SMA50D.

s-leroux · 2024-04-08T08:22:25Z

A possible solution:

quote.select(
    (fc.named("SMA"), fc.sma(50), "CLOSE"),
    (fc.named("COMP"), fc.lt, "CLOSE", "SMA"),
).group_by(
    ("COMP",),
    (f.cnamed("COUNT"), ag.count, "COMP"),
).select(
    (fc.ne, fc.constant(0), "COMP"),
).group_by(
    (fc.named("AVG"), ag.avg, "COUNT"),
)

This requires:

A group_by function with a different semantic from SQL, where rows are grouped by a consecutive sequence of values matching a condition.
A set of aggregate functions (ag.count, ag.avg).

Interestingly, tables defined in seq v0.1.0 had a group_by method. We may look at that for inspiration.

s-leroux · 2024-04-08T08:52:17Z

Interestingly, tables defined in seq v0.1.0 had a group_by method. We may look at that for inspiration.

fin/fin/seq/table.py

Lines 165 to 193 in 9781eeb

    
               def group(self, expr, aggregate_functions = {}): 
        
                   """ 
        
                   Return a new table with consecutive rows producing the same value for expr() 
        
                   grouped together. 
        
                   """ 
        
                   def default_mapping(values): 
        
                       return values[0] 
        
                   def aggregate(dest, src): 
        
                       row = [aggregate_functions.get(name, default_mapping)(column) for name, *column in zip(names, *src)] 
        
                       dest.append(row) 
        
                   names = self.names() 
        
                   rows = self.row_iterator() 
        
                   keys, = self.reval(expr) 
        
                   prev = object() 
        
                   group = [] 
        
                   result = [] 
        
                   for key, row in zip(keys, rows): 
        
                       if key != prev: 
        
                           prev = key 
        
                           if group: 
        
                               aggregate(result, group) 
        
                           group = [] 
        
                       group.append(row) 
        
                   if group: 
        
                       aggregate(result, group) 
        
                   return table_from_data([*zip(*result)], self.names())

s-leroux · 2024-04-09T23:00:28Z

Given the implementation in c3b851d a possible solution would be:

quote.select(
    "DATE",
    (fc.named("SMA"), fc.sma(50), "CLOSE"),
    (fc.named("COMP"), fc.lt, "CLOSE", "SMA"), # <--- WRONG: `select()` is not recursive. We can't reference columns created in the same statement
).group_by(
    "COMP",
    (ag.first, "DATE"),
    (ag.count, fc.named("COUNT"), "COMP"),
).where(
    (fc.ne, fc.constant(0), "COMP"),
).group_by(
    fc.constant(1),
    (ag.avg, fc.named("AVG"), "COUNT"),
)

I am not quite sure of the last group_by clause.
~~It is also worth noting that it is rather confusing to omit the DATE column in the select clause because it is implied, while it is mandatory in group_by.~~ (fixed by a5aff7b)
The first select() statement is wrong since we can't reference a column created in the same statement (whereas it is possible in create())

s-leroux · 2024-04-10T10:37:21Z

The first select() statement is wrong since we can't reference a column created in the same statement (whereas it is possible in create())

Possible solution:
Introduce the extend() predicate that works like create() but starts with a copy of the receiver's columns instead of an empty list.

The above statement might be rewritten as:

quote["CLOSE"].extend(
    (fc.named("SMA"), fc.sma(50), "CLOSE"),
    (fc.named("COMP"), fc.lt, "CLOSE", "SMA"),
).group_by(
...

s-leroux · 2024-04-10T18:56:12Z

Possible solution: Introduce the extend() predicate that works like create() but starts with a copy of the receiver's columns instead of an empty list.

Implemented in 03db264.

s-leroux · 2024-04-11T07:55:08Z

...
).where(
    "COMP",
).group_by(
...

We also need a where() predicate to select rows.

The where predicate returns a series similar to the receiver, but containing only the rows satisfying some condition(s).

s-leroux · 2024-04-11T09:50:42Z

We also need a where() predicate to select rows.

Implemented in 1999c03.

s-leroux · 2024-04-11T10:36:35Z

Full use case implemented in ae6efd4 as:

    quote["Close"].extend(
        (fc.named("Sma"), fc.sma(50), "Close"),
        (fc.named("Comp"), fc.lt, "Close", "Sma"),
    ).group_by(
        "Comp",
        (ag.first, "Date"),
        (ag.first, "Comp"),
        (ag.count, fc.named("Count"), "Comp"),
    ).where(
        "Comp",
    ).group_by(
        "Comp",
        (ag.first, "Date"),
        (ag.avg, fc.named("Avg"), "Count"),
    )

- create() - select() - extend() - sort_by() - group_by() - where() See docs/snippets/snippet_counting_001.py for an example. Close #27

s-leroux added documentation Improvements or additions to documentation enhancement New feature or request labels Apr 7, 2024

s-leroux mentioned this issue Apr 8, 2024

Implement Serie.group_by and aggregate functions #30

Closed

s-leroux mentioned this issue Apr 9, 2024

make the index spec mandatory in select #33

Closed

s-leroux closed this as completed in 33864ca Apr 11, 2024

s-leroux added a commit that referenced this issue Apr 11, 2024

All serie predicates are implemented.

d879ca6

- create() - select() - extend() - sort_by() - group_by() - where() See docs/snippets/snippet_counting_001.py for an example. Close #27

s-leroux mentioned this issue Apr 11, 2024

Distinguish between from_sequence and import #29

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use case: counting the average number of consecutive days BTC closes below its SMA50D #27

Use case: counting the average number of consecutive days BTC closes below its SMA50D #27

s-leroux commented Apr 7, 2024

s-leroux commented Apr 8, 2024

s-leroux commented Apr 8, 2024

s-leroux commented Apr 9, 2024 •

edited

Loading

s-leroux commented Apr 10, 2024

s-leroux commented Apr 10, 2024

s-leroux commented Apr 11, 2024 •

edited

Loading

s-leroux commented Apr 11, 2024

s-leroux commented Apr 11, 2024

Use case: counting the average number of consecutive days BTC closes below its SMA50D #27

Use case: counting the average number of consecutive days BTC closes below its SMA50D #27

Comments

s-leroux commented Apr 7, 2024

s-leroux commented Apr 8, 2024

s-leroux commented Apr 8, 2024

s-leroux commented Apr 9, 2024 • edited Loading

s-leroux commented Apr 10, 2024

s-leroux commented Apr 10, 2024

s-leroux commented Apr 11, 2024 • edited Loading

s-leroux commented Apr 11, 2024

s-leroux commented Apr 11, 2024

s-leroux commented Apr 9, 2024 •

edited

Loading

s-leroux commented Apr 11, 2024 •

edited

Loading