Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Table.add_row_number #6890

Merged
merged 18 commits into from
Jun 2, 2023
Merged

Conversation

radeusgd
Copy link
Member

@radeusgd radeusgd commented May 30, 2023

Pull Request Description

Closes #5227

Important Notes

  • This lays first steps towards Remove overhead of calls from Table java code into Enso code by refactoring the functionality to Enso #6292 - we get pure Enso variants of MultiValueKey.
  • Another part refactors LongStorage into AbstractLongStorage allowing it to provide alternative implementations of the underlying storage, in our case LongRangeStorage generating the values ad-hoc and LongConstantStorage - currently unused but in the future it can be adapted to support constant columns (once we implement similar facilities for other types).

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@radeusgd
Copy link
Member Author

image

@radeusgd radeusgd force-pushed the wip/radeusgd/6860-add-row-number branch from 0061994 to e52a4b3 Compare May 31, 2023 12:38
import org.enso.table.data.column.storage.DateStorage;
import org.enso.table.data.column.storage.datetime.DateStorage;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved 'classes' of related storages (datetime and numeric) into sub-packages, as the top-level storage package grew quite a bit and I think this way it will be a bit more manageable.

Comment on lines +1 to +15
package org.enso.table.data.column.storage.numeric;

public class LongConstantStorage extends ComputedLongStorage {
private final long constant;

public LongConstantStorage(long constant, int size) {
super(size);
this.constant = constant;
}

@Override
protected long computeItem(int idx) {
return constant;
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class, together with ComputedLongStorage demonstrates how we can implement O(1) storages for constant columns.

At some point we may do a similar refactor for all other storage types (we should be able to share most of the code for the non-primitive storages, only DoubleStorage will also need special handling to avoid unnecessary boxing, like we are doing here) to support the constant columns better.

@radeusgd radeusgd force-pushed the wip/radeusgd/6860-add-row-number branch from e52a4b3 to e7a94c4 Compare May 31, 2023 12:47
Copy link
Member

@jdunkerley jdunkerley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

The start of refactoring the table is great. I'd like to understand performance of Multi_Value_Key in Enso vs Java.

Comment on lines 1291 to 1294
Set_Mode.Update -> if self.java_table.getColumnByName renamed.name . is_nothing . not then True else
Error.throw (Missing_Column.Error renamed.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really part of this but we could do something using get here. Like:

Suggested change
Set_Mode.Update -> if self.java_table.getColumnByName renamed.name . is_nothing . not then True else
Error.throw (Missing_Column.Error renamed.name)
Set_Mode.Update -> if self.get renamed.name if_missing=(Error.throw (Missing_Column.Error renamed.name))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work, but the other branch that is the reversed condition will not work that easily. And I'd prefer to keep the branches symmetric.

import org.enso.table.data.column.storage.DoubleStorage;
import org.enso.table.data.column.storage.LongStorage;
import org.enso.table.data.column.storage.Storage;
import org.enso.table.data.column.storage.*;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we avoid *?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sorry - my formatter sometimes brings these in.

@radeusgd radeusgd force-pushed the wip/radeusgd/6860-add-row-number branch from e7a94c4 to 9a5bc94 Compare June 1, 2023 17:53
@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Jun 1, 2023
@radeusgd radeusgd force-pushed the wip/radeusgd/6860-add-row-number branch from 9a5bc94 to 7558c56 Compare June 2, 2023 09:22
@mergify mergify bot merged commit d44b125 into develop Jun 2, 2023
@mergify mergify bot deleted the wip/radeusgd/6860-add-row-number branch June 2, 2023 10:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement the add_row_number function Add Table.add_row_number function
3 participants