Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove duplicates in list_versions #784

Merged
merged 12 commits into from Nov 8, 2019

Conversation

c4ffein
Copy link
Contributor

@c4ffein c4ffein commented Nov 5, 2019

Updated list_versions algorithm to ignore both what can be detected as a mocked manifest, and redundant entries gotten from updates of the manifests constituting the parent path.
Resolves #766

…s a mocked manifest, and redundant entries gotten from updates of the manifests constituting the parent path
@codecov
Copy link

codecov bot commented Nov 5, 2019

Codecov Report

Merging #784 into master will increase coverage by 0.58%.
The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #784      +/-   ##
==========================================
+ Coverage    47.3%   47.88%   +0.58%     
==========================================
  Files         258      258              
  Lines       21363    21902     +539     
==========================================
+ Hits        10105    10487     +382     
- Misses      11258    11415     +157
Impacted Files Coverage Δ
parsec/core/gui/file_history_dialog.py 0% <0%> (ø) ⬆️
parsec/core/gui/versions_table.py 0% <0%> (ø) ⬆️
parsec/core/gui/lang.py 0% <0%> (ø) ⬆️
parsec/core/fs/workspacefs/workspacefs.py 97.94% <100%> (+1.78%) ⬆️
parsec/core/fs/workspacefs/versioning_helpers.py 97.67% <100%> (+1.47%) ⬆️
parsec/core/mountpoint/winfsp_operations.py 23.64% <0%> (-5.1%) ⬇️
parsec/core/fs/remote_loader.py 75.8% <0%> (-0.81%) ⬇️
parsec/core/gui/central_widget.py 0% <0%> (ø) ⬆️
parsec/core/gui/app.py 0% <0%> (ø) ⬆️
parsec/api/protocol/types.py 100% <0%> (ø) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 71e62e5...b98c79c. Read the comment docs.

parsec/core/fs/workspacefs/versioning_helpers.py Outdated Show resolved Hide resolved
parsec/core/fs/workspacefs/versioning_helpers.py Outdated Show resolved Hide resolved
parsec/core/fs/workspacefs/versioning_helpers.py Outdated Show resolved Hide resolved
):
previous = item
continue
new_list += [previous]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_list.append(previous) ?

parsec/core/fs/workspacefs/versioning_helpers.py Outdated Show resolved Hide resolved
parsec/core/fs/workspacefs/workspacefs.py Outdated Show resolved Hide resolved
tests/core/fs/workspacefs_timestamped/test_workspacefs.py Outdated Show resolved Hide resolved
@@ -114,6 +160,20 @@
assert versions_list[7][1] == ((alice.device_id, Pendulum(2000, 1, 13), True, None), None, None)


@pytest.mark.trio
async def test_versions_existing_directory_no_remove_mocked(alice_workspace, alice):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tu peux utiliser parametrize pour regrouper les deux tests

@pytest.mark.trio
@pytest.mark.parametrize('remove_supposed_mock', (False, True))
async def test_versions_existing_directory(alice_workspace, alice, remove_supposed_mock):
    versions = await alice_workspace.versions(FsPath("/files"), remove_supposed_mock=remove_supposed_mock)
    versions_list = list(versions.items())
    [...]

and previous[1][0][3] == 0 # Previous is empty, is a file (would be None for dir)
and previous[1][0][2] == item[1][0][2] # Previous and item are of the same type
and previous[0][3] == item[0][2] # Previous and current are continuous in time
and previous[0][3] < previous[0][2].add(seconds=30) # Check 30 seconds time frame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pourquoi 30s ? il faudrait au moins en faire une constante (enfin un truc en majuscule quoi ^^) en haut de fichier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fait

for item in sorted(
list(return_tree.items()), key=lambda item: (item[0][3], item[0][0], item[0][1])
)
}
]
# Remove duplicates from father updated and empty manifests set before parents for consistency
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cette boucle est très compliquée, il faudrait expliquer précisément à quoi elle sert, ainsi que la logique de chacun des if menant à un continue

@touilleMan
Copy link
Member

et il manque un newsfragment ^^

Copy link
Member

@touilleMan touilleMan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

En passant dans le code, j'ai vu que tu faisais des conversion de datetime en heure locale (dans ts_ws_dialog.py).
Il faut utiliser parsec.core.gui.lang.format_datetime pour afficher les dates à l'utilisateur car cette fonction prend en compte à la fois le fuseau horaire et le format d'affichage de l'heure en fonction de la locale de l'utilisateur.

SYNC_GUESSED_TIME_FRAME = 30


class TimestampBoundedEntry(NamedTuple):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sympa ! je ne connaissais pas cet usage de NamedTuple ;-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Par contre je me demande si on ne devrait pas remplacer les NamedTuple par des attr.s(slots=True).
L'intérêt étant qu'on ne peut plus accéder aux attributs par index, ce qui fera planter les endroits où il faut modifier le code afin d'améliorer la lisibilité ^^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alors justement, je trouvais ça pratique pour les tests de pouvoir faire des comparaisons partielles de tuple, mais j'y réfléchirai pour la prochaine version

async def test_versions_existing_file_no_remove_minimal_synced(alice_workspace, alice):
versions = await alice_workspace.versions(
FsPath("/files/renamed_content"), remove_supposed_minimal_sync=False
)
versions_list = list(versions.items())

assert len(versions_list) == 6

# Moved /files/content to /files/renamed_content on day 5, moved it again later
assert versions_list[0][0][1:] == (3, Pendulum(2000, 1, 6), Pendulum(2000, 1, 7))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Il faudrait corriger aussi tous ces [0][0][1:] qui manquent de lisibilité pour les remplacer par des accès aux attributs nommés

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C'est déjà plus simple avec ce que j'ai fait, tu préfères que j'explicite tous les champs?

Tuple[EntryID, int, Pendulum, Pendulum],
Tuple[Tuple[DeviceID, Pendulum, bool, int], FsPath, FsPath],
]:
workspacefs, path: FsPath, remove_supposed_minimal_sync: bool = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dans remove_supposed_minimal_sync le remove me semble bizarre: on supprime les minimal sync de la liste qu'on retourne, mais il ne s'agit au final que d'une opération intermédiaire.
Du point de vu de l'appelant on nous fourni une list ne contenant pas de minimal sync, du coup je mettrais plutôt skip_supposed_minimal_sync (voir le skip_minimal_sync que je proposais avant, car j'imaginuqe qu'on peut parler du fait qu'on va faire des suppositions sur ce qui est un minimal sync dans le doctstring de la fonction)



class TimestampBoundedData(NamedTuple):
device_id: DeviceID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pas mieux de parler de creator ici ?

class TimestampBoundedData(NamedTuple):
device_id: DeviceID
updated: Pendulum
is_dir: bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

J'ai l'impression qu'on ne parle pultôt de folder dans le workspace (on a juste mkdir, iterdir, rmdir et is_dir dans l'api haut niveau de workspacefs pour imiter pathlib mais c'est tout).
Du coup je mettrais k.is_folder plutôt

previous = item
if previous:
new_list.append(previous)
return {k: TimestampBoundedData(*v.data, v.source, v.destination) for k, v in new_list}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On ne peut pas directement construire le dict finale au lieu d'avoir new_list ?

for item in versions_list:
if previous is not None:
# If same entry_id and version
if previous[0].id == item[0].id and previous[0].version == item[0].version:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trop de [0], il faut mieux avoir des variables intermédiaires qui aident à comprendre ce qu'on manipule

pourquoi ne pas remplacer previous par previous_ts_bounded_entry et previous_manifest_data_and_path ?
pareil pour item qu'on peut décomposer dès l'itérateur for item_ts_bounded_entry, item_manifest_data_and_path in version_list

(oui les noms son super long, faudrait peut être en trouver des mieux ^^)

Tuple[Tuple[DeviceID, Pendulum, bool, int], FsPath, FsPath],
]:
workspacefs, path: FsPath, remove_supposed_minimal_sync: bool = True
) -> Dict[TimestampBoundedEntry, TimestampBoundedData]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

En regardant où est utilisé list_versions (enfin plutôt WorkspaceFS.versions), je ne comprends pas pourquoi on a besoin de retourner quelque chose de si compliqué:

  • dans ts_ws_dialog.py, on a juste besoin du timestamp de la v0 du workspace manifest (on pourrait obtenir ça en faisant juste un workspacefs.remote_loader.list_versions sur l'id du workspace)
  • dans file_history_dialog.py, on itère sur les clé et valeurs donc autant retourner une liste d'instances d'une seule structure

@c4ffein c4ffein merged commit e71cc51 into master Nov 8, 2019
@FirelightFlagboy FirelightFlagboy deleted the fix-remove-duplicates-in-list-versions branch May 9, 2023 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

File history window contains redundant entries (ubuntu client)
2 participants