Conversation

@dtrifiro
Contributor

Calling `parts()` to get `name` does a lot of unnecessary splitting, which quickly adds up when dealing with large datasets.
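As a rough illustration of the cost described here, the sketch below compares three ways of getting the last path component. The function names are hypothetical and are not taken from the code base; the sketch assumes a POSIX-style `/` separator and uses the standard library's `posixpath.basename` as a stand-in for whatever `flavour.basename` resolves to.

```python
import posixpath


def name_via_full_split(path: str, sep: str = "/") -> str:
    # Hypothetical "before": split on every separator and build the
    # whole list of components, only to keep the last one.
    return path.split(sep)[-1]


def name_via_rsplit(path: str, sep: str = "/") -> str:
    # Split at most once, starting from the right, so the work does not
    # grow with the number of components in the path.
    return path.rsplit(sep, 1)[-1]


def name_via_basename(path: str) -> str:
    # Standard-library helper; assumes POSIX path semantics.
    return posixpath.basename(path)


example = "data/raw/images/train/cat.0001.jpg"
assert name_via_full_split(example) == "cat.0001.jpg"
assert name_via_rsplit(example) == "cat.0001.jpg"
assert name_via_basename(example) == "cat.0001.jpg"
```

All three return the same name for this input; the difference is only in how much intermediate work is done per call, which matters when the call is repeated for every entry of a large dataset.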

@dtrifiro dtrifiro requested a review from a team as a code owner May 17, 2022 11:48
@dtrifiro dtrifiro requested a review from karajan1001 May 17, 2022 11:48
@dtrifiro dtrifiro added the performance improvement over resource / time consuming tasks label May 17, 2022
@efiop
Contributor

efiop commented May 17, 2022

Btw, an alternative would be to make `parts` a generator. That would require more changes across the code base, but it is probably the correct long-term solution. It might be unnecessary for now, though.
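One possible reading of the generator idea, as a minimal sketch: yield components lazily so a caller that only needs the name stops after the first one. The helper `iter_parts_reversed` is hypothetical, and the right-to-left order is an assumption made for this illustration, not something stated in the thread.

```python
from typing import Iterator


def iter_parts_reversed(path: str, sep: str = "/") -> Iterator[str]:
    """Yield path components lazily, from the last one to the first."""
    end = len(path)
    while True:
        idx = path.rfind(sep, 0, end)
        if idx == -1:
            # No separator left: the remainder is the first component.
            yield path[:end]
            return
        yield path[idx + 1:end]
        end = idx


# Only the last component is computed; the rest of the path is never split.
name = next(iter_parts_reversed("data/raw/train/cat.jpg"))
assert name == "cat.jpg"

# Consuming the whole generator still recovers every component.
all_parts = list(iter_parts_reversed("data/raw/train/cat.jpg"))[::-1]
assert all_parts == ["data", "raw", "train", "cat.jpg"]
```

Callers that index into `parts` or take its length would need to change under this approach, which matches the remark that it touches more of the code base.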

@dtrifiro dtrifiro marked this pull request as draft May 17, 2022 13:39
@dtrifiro dtrifiro force-pushed the fix/speed-up-path branch from 7aca800 to b0a7448 Compare May 17, 2022 15:51
@dtrifiro dtrifiro changed the title from "objects.fs: use rsplit in Path.name" to "objects.fs: use split in Path.name" May 17, 2022
@efiop efiop force-pushed the fix/speed-up-path branch 4 times, most recently from dd7cec9 to a10269b Compare May 17, 2022 16:58
@efiop efiop force-pushed the fix/speed-up-path branch from a10269b to 37d1914 Compare May 17, 2022 16:59
@efiop efiop marked this pull request as ready for review May 17, 2022 17:06
@efiop efiop merged commit 01cc97f into treeverse:main May 17, 2022
@efiop efiop changed the title from "objects.fs: use split in Path.name" to "fs: path: use flavour.basename" May 17, 2022
@dtrifiro dtrifiro deleted the fix/speed-up-path branch May 18, 2022 10:25