Skip to content
This repository has been archived by the owner on Feb 4, 2020. It is now read-only.

Make leader mutable? #108

Closed
Wooble opened this issue Sep 18, 2017 · 14 comments
Closed

Make leader mutable? #108

Wooble opened this issue Sep 18, 2017 · 14 comments

Comments

@Wooble
Copy link
Collaborator

Wooble commented Sep 18, 2017

It would be really convenient to be able to mutate the leader somehow.

I find myself writing record.leader = record.leader[0:9] + 'a' + record.leader[10:] when converting MARC-8 to UTF-8 to work around the leader being an immutable string, which is pretty ugly.

(On the other hand, I'm not all that sure this could be done in a way that's backwards-compatible with things that expect record.leader to be a normal string. record.leader[9] = 'a' might be unrealistic to support, but perhaps we could add a method that flips a specific part of the leader?)

@souzaluuk
Copy link
Contributor

Maybe go to the list type and use the python getter / setter methods to make it easier.
But I do not know about the compatibility of the idea with the other versions of python.

@edsu
Copy link
Owner

edsu commented Jul 8, 2019

A list would seem to make more sense. We could consider a new major version of pymarc in order to break backwards compatibility.

@souzaluuk
Copy link
Contributor

I can try, so we can see the best way.
What do you think?

@Wooble
Copy link
Collaborator Author

Wooble commented Jul 8, 2019

I think a list makes it harder to get at the pieces of the leader that use more than 1 character of the string.

I suspect a method would work better, although I don't have a good suggestion for a name or even the API.

MARC::Record solves my particular use case by having $record->encoding('UTF-8') set leader[9] to "a" and $record->encoding('MARC-8') set it to " ", although it's got nothing for changing other things in the leader besides the length/base address.

@herrboyer
Copy link
Contributor

herrboyer commented Jan 7, 2020

Hi, we're facing the same kind of issue right now, not blocking at all, but I was wondering about creating a maybe backward compatible Leader object like this one :

class Leader:
    """Mutable leader.

    Usage:
    leader = Leader("00475cas a2200169 i 4500")
    leader[0:4]  # 00475
    leader.status  # "c"
    leader.status = "a"
    leader[6] = "b"
    leader.record_type  # "b"
    str(leader)  # "00475abs a2200169 i 4500"
    """

    def __init__(self, value: str):
        self.value = value

    def __getitem__(self, item) -> str:
        """leader[:4] == leader.length."""
        if isinstance(item, slice) or isinstance(item, int):
            return self.value[item]
        return getattr(self, item)

    def __setitem__(self, item, val):
        """leader[5] works, leader[0:4] too, as well as leader.status = "a"."""
        if isinstance(item, slice):
            self._update_value(start=item.start - 1, stop=item.stop + 1, val=val)
        elif isinstance(item, int):
            self._update_value(start=item - 1, stop=item + 1, val=val)
        else:
            setattr(self, item, val)

    def __str__(self) -> str:
        return self.value

    def _update_value(self, start: int, stop: int, val: str):
        """Replace the chars between `start` & `stop` to `val` in `self.value`."""
        start = max(0, start)
        stop = min(LEADER_LEN, stop)
        if len(val) != stop - start:
            raise Exception("val length should match the replaced chars")
        self.value = self.value[:start] + val + self.value[stop:]

    @property
    def length(self) -> str:
        return self.value[:4]

    @property
    def status(self) -> str:
        return self.value[5]

    @status.setter
    def status(self, status: str):
        self._update_value(start=4, stop=5, val=status)

    @property
    def record_type(self) -> str:
        return self.value[6]

    @record_type.setter
    def record_type(self, record_type: str):
        self._update_value(start=6, stop=7, val=record_type)

The idea would be to use it in Record self.leader = Leader(leader[0:10] + '22' + leader[12:20] + '4500'). It should be almost backward compatible with the leader as a string and provide helpful getters & setters to manipulate it.

It's a bit too much on the magic side of python with its heavy use of dunders though...

I could keep poking around if you think it could be a good solution to this issue.

@edsu
Copy link
Owner

edsu commented Jan 8, 2020

I really like this proposal @herrboyer. I'm curious to know what others think. Do you want to prepare a pull request?

@Wooble
Copy link
Collaborator Author

Wooble commented Jan 8, 2020

I sort of feel like just replacing the leader with a bytearray would be cleaner, although either way things are going to break :(

@herrboyer
Copy link
Contributor

Thanks for your feedback.

We'll quickly see how things will go as soon as I find some time to make a PR.

@edsu
Copy link
Owner

edsu commented Jan 8, 2020

@Wooble I actually don't see how much will break with @herrboyer's proposal. Does something pop out at you?

@Wooble
Copy link
Collaborator Author

Wooble commented Jan 8, 2020

My example workaround code in the original message in the issue, for one thing, although that one could be fixed by overriding __add__ and __radd__ on the Leader object. (Also it would be silly to be doing that anymore once the leader is mutable, but legacy code breaking is still annoying even if it's breaking because it's an ugly hack that is no longer necessary.)

Anything that just does a naive record.leader = 'some other leader' will give you a str instead of a Leader, which presumably will need to be handled on serialization or which will also fail. Also anything making the assumption that record.leader is a str object so it doesn't bother calling str() on it won't use __str__, and might break depending on what whoever's making that assumption does with it.

I'd be interested to see what breaks in the existing test suite just by simply dropping in a Leader and not changing anything else that uses record.leader. Probably no one should be doing most of the sorts of things with the leader that pymarc has to do internally, but it might point to any other gotchas.

@edsu
Copy link
Owner

edsu commented Jan 8, 2020

Good points, thanks @Wooble. So whatever the change it will probably be a major version change if we can't guarantee backwards compatibility.

@herrboyer
Copy link
Contributor

As you guessed @Wooble serialization was broken but easily fixed, let me now what you think of the PR.

@herrboyer
Copy link
Contributor

record.leader = record.leader[0:9] + 'a' + record.leader[10:] will work as expected, except record.leader would not be a Leader object anymore, but a simple string, which probably won't matter much on legacy code, don't you think @Wooble ?

I tried to do something about it but overriding add/radd won't solve it, as a slice returns a string not the Leader itself.

@Wooble
Copy link
Collaborator Author

Wooble commented Jan 8, 2020

Oh, right, you'd have to make the slicing return a weird partial Leader object for that to work, and that would definitely be going too far...

@Wooble Wooble closed this as completed Jan 14, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants