Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v3] LocalStore.get clarification #1745

Open
d-v-b opened this issue Apr 5, 2024 · 4 comments
Open

[v3] LocalStore.get clarification #1745

d-v-b opened this issue Apr 5, 2024 · 4 comments
Labels
types V3 Related to compatibility with V3 spec
Milestone

Comments

@d-v-b
Copy link
Contributor

d-v-b commented Apr 5, 2024

in v3, the following function gets bytes for the LocalStore.get method:

def _get(path: Path, byte_range: Optional[Tuple[int, Optional[int]]] = None) -> bytes:
    if byte_range is not None:
        start = byte_range[0]
        end = (start + byte_range[1]) if byte_range[1] is not None else None
    else:
        return path.read_bytes()
    with path.open("rb") as f:
        size = f.seek(0, io.SEEK_END)
        if start is not None:
            if start >= 0:
                f.seek(start)
            else:
                f.seek(max(0, size + start))
        if end is not None:
            if end < 0:
                end = size + end
            return f.read(end - f.tell())
        return f.read()

I have a few questions about this function that would help me write tests for it. I think @jhamman @normanrz are the main people I need answers from, but I'd be curious to hear if anyone else has thoughts.

  • the annotation of the function signature is Optional[Tuple[int, Optional[int]]], but the body of the function can handle tuple[None, int]. Should we change annotation of the function signature to tuple[int | None, int | None] | None?
  • I was initially confused about the semantics of byte_range -- I thought it defined an interval as a start and a stop, but now I understand that byte_range defines a start and a step size. This would be more transparent if we used keyword arguments like start and step instead of a tuple. Would switching to keyword arguments be a problem for this API? I think the win in clarity might be worth it.
@d-v-b d-v-b added V3 Related to compatibility with V3 spec types labels Apr 5, 2024
@d-v-b
Copy link
Contributor Author

d-v-b commented Apr 5, 2024

actually, the second conditional block can handle byte_range[0] being None, but this line

 end = (start + byte_range[1]) if byte_range[1] is not None else None

from the first conditional block errors if byte_range[0] is None, due to adding an int to None.

@jhamman jhamman added this to the 3.0.0.alpha milestone Apr 5, 2024
@normanrz
Copy link
Contributor

  • the annotation of the function signature is Optional[Tuple[int, Optional[int]]], but the body of the function can handle tuple[None, int]. Should we change annotation of the function signature to tuple[int | None, int | None] | None?

The current API is:

class ByteRange(NamedTuple):
      start: int  # Can be negative for indexing from the end 
      end: Optional[int] = None

def _get(path: Path, byte_range: Optional[ByteRange] = None) -> bytes:

Not sure why we would allow start == None. Would that be equivalent to start == 0?

  • I was initially confused about the semantics of byte_range -- I thought it defined an interval as a start and a stop, but now I understand that byte_range defines a start and a step size. This would be more transparent if we used keyword arguments like start and step instead of a tuple. Would switching to keyword arguments be a problem for this API? I think the win in clarity might be worth it.

I don't think the byte range defines a step size. Where are you getting that from?

@d-v-b
Copy link
Contributor Author

d-v-b commented Apr 15, 2024

I don't think the byte range defines a step size. Where are you getting that from?

That was my impression from the implementation

end = (start + byte_range[1]) if byte_range[1] is not None else None

@normanrz
Copy link
Contributor

Ah, I would call that length or count. Anyways, seems you're right. I don't think I ever supplied the second tuple component in zarrita 🤷

@jhamman jhamman modified the milestones: 3.0.0.alpha, 3.0.0 May 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
types V3 Related to compatibility with V3 spec
Projects
Status: Todo
Development

No branches or pull requests

3 participants