
Landmark Datasets

@mdsrqbl mdsrqbl released this 07 Nov 15:45
· 10 commits to main since this release

This release contains sign language videos embedded as landmark sequences and stored as CSV files inside zip archives. The landmarks are rounded to 4 decimal places, which gives a precision of 0.1 mm in world coordinates and 1 pixel in a 10k-resolution image.
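The precision figures follow from simple arithmetic. The sketch below reproduces them, assuming "10k resolution" means a frame 10,000 pixels wide; the variable names are illustrative only.

```python
# Landmarks are stored rounded to 4 decimal places.
DECIMALS = 4
step = 10 ** -DECIMALS  # smallest stored increment: 0.0001

# World coordinates are in meters, so one step corresponds to 0.1 mm.
step_mm = step * 1000

# Image coordinates are fractions of the frame size, so in a frame that is
# 10,000 pixels wide (an assumed reading of "10k") one step spans 1 pixel.
step_px = step * 10_000

# Rounding a raw coordinate the same way the release describes.
rounded = round(0.123456, DECIMALS)

print(f"{step_mm:.1f} mm, {step_px:.0f} px")
```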

The text transcription (gloss) of each sign is present in the file names. More synonyms and translations that map to these signs can be found in the JSON data in the repo. The dataset has three categories:

  • Standard Dictionary: (788 + 1)
    Standard sign language dictionaries obtained from recognized organizations. The names are country-organization-groupNumber.landmarks-embeddingModel-extension.zip

  • Dictionary Replications: (788 * 12 * 4 = 37,824) (coming soon!)
    Manually recorded sign language videos that are replications of the reference clips. The names are country-organization-groupNumber_personCode_cameraAngle.landmarks-embeddingModel-extension.zip
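The two naming patterns above can be split into their fields with a small parser. This is a minimal sketch, not part of the release: `parse_archive_name` and `ArchiveName` are hypothetical names, the sample file names in the usage note are made up, and the code assumes the embedding model name may itself contain hyphens while the extension does not.

```python
from typing import NamedTuple, Optional


class ArchiveName(NamedTuple):
    country: str
    organization: str
    group_number: str
    person_code: Optional[str]   # only present for replications
    camera_angle: Optional[str]  # only present for replications
    embedding_model: str
    extension: str


def parse_archive_name(file_name: str) -> ArchiveName:
    """Split an archive name into the fields described in the release notes."""
    if not file_name.endswith(".zip"):
        raise ValueError(f"not a zip archive name: {file_name!r}")
    stem, sep, suffix = file_name[: -len(".zip")].partition(".landmarks-")
    if not sep:
        raise ValueError(f"missing '.landmarks-' marker: {file_name!r}")

    # Assumption: the extension is the last hyphen-separated token.
    embedding_model, _, extension = suffix.rpartition("-")
    if not embedding_model:
        raise ValueError(f"missing embedding model or extension: {file_name!r}")

    # Replications append _personCode_cameraAngle to the group identifier.
    group, *extra = stem.split("_")
    if len(extra) not in (0, 2):
        raise ValueError(f"unexpected person/camera fields: {file_name!r}")
    person_code, camera_angle = extra if extra else (None, None)

    parts = group.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected country-organization-groupNumber: {file_name!r}")
    country, organization, group_number = parts

    return ArchiveName(country, organization, group_number,
                       person_code, camera_angle, embedding_model, extension)
```

For example, with made-up names, `parse_archive_name("xx-org-1.landmarks-model-csv.zip")` leaves `person_code` and `camera_angle` as `None`, while `parse_archive_name("xx-org-1_p01_front.landmarks-model-csv.zip")` fills them with `"p01"` and `"front"`.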

MediaPipe landmarks Header

World coordinates are 3D body joint coordinates in meters. Image coordinates are fractions of the video height/width at which the landmark is located, and the z value is depth from the camera.

For both models, we get 33 pose landmarks, 21 landmarks per hand, and 5 values per landmark (x, y, z, visibility, presence).
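As a quick check of what these counts imply for the row width, the sketch below totals the values per frame. It assumes both hands are stored in every row and that every landmark carries all five values; the column order and header names are not specified here, so none are implied.

```python
POSE_LANDMARKS = 33
HAND_LANDMARKS = 21
HANDS = 2  # assumption: left and right hand are both stored per frame
VALUES_PER_LANDMARK = ("x", "y", "z", "visibility", "presence")

# 33 pose landmarks plus 21 landmarks for each of the two hands.
landmarks_per_frame = POSE_LANDMARKS + HANDS * HAND_LANDMARKS

# Each landmark contributes five numbers to a row.
values_per_frame = landmarks_per_frame * len(VALUES_PER_LANDMARK)

print(landmarks_per_frame, values_per_frame)  # 75 375
```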

total_rows = number_of_frames in source video
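Since each row corresponds to one frame, the frame count of every clip can be read directly from an archive. The following is a minimal sketch using only the Python standard library; `frames_per_clip` is a hypothetical helper, the `.csv` member suffix is assumed, and whether the files begin with a header row is not stated here, so that is left as a parameter. The demo archive is synthetic and built in memory.

```python
import csv
import io
import zipfile


def frames_per_clip(archive, has_header=False):
    """Return {csv_member_name: number_of_frames} for a landmarks archive.

    `archive` is a path or a binary file-like object. Set `has_header=True`
    if the CSV files start with a header row (an assumption to verify).
    """
    counts = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                rows = sum(1 for row in csv.reader(text) if row)
            counts[name] = rows - 1 if has_header and rows else rows
    return counts


# Synthetic in-memory archive with one three-frame clip (made-up values).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("hello.csv", "0.1,0.2\n0.3,0.4\n0.5,0.6\n")
buffer.seek(0)

print(frames_per_clip(buffer))  # {'hello.csv': 3}
```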