# How to Manage Attribute and Data Access in Python Classes
## Imposing restrictions to protect your code
![](images/pexels.jpg)
<figcaption style="text-align: center;">
    <strong>
        Photo by 
        <a href='https://www.pexels.com/@pixabay?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Pixabay</a>
        on 
        <a href='https://www.pexels.com/photo/black-android-smartphone-on-top-of-white-book-39584/?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Pexels</a>
    </strong>
</figcaption>

In Python, all class data is technically public. If you want, you can change much of a library's internal logic by tapping into its source code and ruin everything for yourself. If your programming background is in another language such as Java or C++, you might be surprised that this is the case.
You might be even more surprised that these types of *screw-ups* almost never happen among Python developers. So, what is our secret?

It turns out that the fundamental principle behind much of Python code design is "we are all adults here". If you have never heard this before, now is the time to learn that we, as Python developers, should trust and respect each other's code.

All OOP code in Python follows universal naming conventions (a *code* among us, if you will) that indicate that even though some code is public, it is not meant for external use. These are the parts of the code that work silently under the hood and are never meant for public use.

In this article, you will learn about these naming conventions and special attributes called properties that help you manage access to the private internals of your Python classes.


### Internal attributes and methods

The first naming convention widely accepted among Python developers is using a single leading underscore to indicate a method or an attribute is private:

```python
class Example:
    def __init__(self, x):
        self._x = x  # Private attribute
        ...

    def _helper_function(self):  # Private method
        ...
```

This also tells other users of your class that these methods or attributes are not part of the public API and can change without any notice or deprecation warning. 
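
To make the convention concrete, here is a minimal sketch. The `Invoice` class and its method names are made up for illustration and do not come from any library. It shows a public method built on an internal helper, and that the underscore is only a signal: Python does not stop anyone from calling the helper.

```python
class Invoice:  # hypothetical class, for illustration only
    def __init__(self, amounts):
        self._amounts = list(amounts)  # internal: not part of the public API

    def _subtotal(self):  # internal helper, may change without notice
        return sum(self._amounts)

    def total(self, tax_rate=0.0):  # public method built on the helper
        return self._subtotal() * (1 + tax_rate)


invoice = Invoice([10, 20, 30])
print(invoice.total())      # 60.0
print(invoice._subtotal())  # 60 (works, the underscore is only a convention)
```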

You would usually use this syntax for helper functions or attributes. For example, scikit-learn uses a number of internal functions such as `_num_samples()`, along with many others that live under `sklearn.utils.validation`. Similarly, the `datetime` module has a `_ymd2ord()` function that converts a year, month, and day to an ordinal day number, counting January 1 of year 1 as day 1.

Another naming convention is similar to the first. When a developer really wants a method or an attribute to stay *private*, or as close to private as possible, they use double leading underscores:

```python
class Example:

    __example_attr = "I am private"  # Private attribute

    def __init__(self):
        pass

    def __example_method(self):  # Private method
        print("I am private")
```

This is not just an agreed-upon convention but a mechanism built into the Python language itself to protect methods and attributes from being overridden or changed accidentally. Even though these attributes look normal, you cannot access them in the regular way:

```python
obj = Example()
print(obj.__example_attr)
```

```
AttributeError: 'Example' object has no attribute '__example_attr'
```

Tapping into such private data requires you to use the `obj._ClassName__attrname` syntax:

```python
obj._Example__example_attr
```

```
'I am private'
```

```python
obj._Example__example_method()
```

```
I am private
```


Whenever a developer creates a method or attribute with a leading double underscore inside a class, Python automatically prepends `_ClassName` to its name (this is known as name mangling). This has the above-mentioned benefit of signaling to regular users that it is private data. Also, when someone inherits from this class, they will not accidentally override or change the behavior of the class by redeclaring names that already exist in the parent class.
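
The inheritance benefit can be sketched as follows. The `Parent` and `Child` classes and the `__token` attribute are hypothetical names chosen for this illustration. Because each class gets its own prefix, both attributes live side by side on the same instance.

```python
class Parent:  # hypothetical class
    def __init__(self):
        self.__token = "parent"  # stored as _Parent__token

    def parent_token(self):
        return self.__token  # always reads _Parent__token


class Child(Parent):  # hypothetical subclass reusing the same name
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token, so no clash

    def child_token(self):
        return self.__token


c = Child()
print(c.parent_token())  # parent
print(c.child_token())   # child
print(sorted(vars(c)))   # ['_Child__token', '_Parent__token']
```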

Be careful with this syntax and don't confuse it with Python's special ("dunder") names. Python uses leading *and trailing* double underscores for special methods and attributes such as `__init__`, and names of that form are not mangled.
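
A quick way to check the difference, using a throwaway `Demo` class whose attribute names are invented for this sketch:

```python
class Demo:  # hypothetical class
    __hidden = 1     # leading double underscore only: mangled to _Demo__hidden
    __visible__ = 2  # leading and trailing double underscores: not mangled


print(hasattr(Demo, "__hidden"))       # False
print(hasattr(Demo, "_Demo__hidden"))  # True
print(Demo.__visible__)                # 2
```

That said, names in the `__name__` style are best left to Python itself, so this sketch is only meant to show the mangling rule, not to recommend the pattern.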

Finally, nothing prevents you from using them, but these naming conventions are a developer's way of saying 'don't touch this', and as an *adult*, it is your responsibility to honor this trust (sorry for being dramatic😁).

### Customizing access to attributes

Let's say we have this simple `Book` class:

```python
class Book:
    def __init__(self, title, author, n_pages):
        self.title = title
        self.author = author
        self.n_pages = n_pages
```

We know that you can access and change any attribute of a class instance:

```python
book1 = Book("Winds of Winter", "George R. R. Martin", 500)

# You can change the value of the attributes
book1.title = "To be published in another 10 years"
book1.author = 72
book1.n_pages = "3000 pages of manuscript"
```

This gives flexibility, but it also means you can assign any value and data type you want to these attributes. Sure, you could implement some logic in the constructor that validates each attribute value, but your constructor would get messy.
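
Here is a rough sketch of what that constructor-based validation might look like. The `CheckedBook` name and the specific checks are assumptions made for illustration. Besides being verbose, the checks only run at creation time, so later assignments still slip through.

```python
class CheckedBook:  # hypothetical variant of Book, for illustration only
    def __init__(self, title, author, n_pages):
        # Every attribute needs its own block of checks
        if not isinstance(title, str):
            raise TypeError("title must be a string")
        if not isinstance(author, str):
            raise TypeError("author must be a string")
        if not isinstance(n_pages, int) or n_pages <= 0:
            raise ValueError("n_pages must be a positive integer")
        self.title = title
        self.author = author
        self.n_pages = n_pages


book = CheckedBook("Winds of Winter", "George R. R. Martin", 500)
book.n_pages = "3000 pages of manuscript"  # still allowed: checks only run in __init__
```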

The title, author, and number of pages of a book are its important qualities, so they should not be allowed to change so easily.

So, how do we control access to these attributes or even make them read-only? There are already many examples of this in other classes you may have used. For example, you cannot assign a new shape to a pandas DataFrame after it has been created, and you cannot pass invalid values to its constructor:

```python
import pandas as pd

df = pd.DataFrame()
df.shape = (4, 5)
```

```
AttributeError: can't set attribute
```

```python
pd.DataFrame("Hello Python")
```

```
ValueError: DataFrame constructor not properly called!
```

We can implement similar behavior using tools called Python descriptors. The full extent of descriptors is way outside the scope of this article; here, we will only need one of them: the built-in `property`, used as the `@property` decorator.

We start by creating a 'protected' attribute of the kind we discussed in the previous section:

```python
class Book:
    def __init__(self, title, author, n_pages):
        self._n_pages = n_pages  # Pay attention to the attribute name

    @property
    def n_pages(self):
        print("Getter method is called!")
        return self._n_pages
```

Then, we create a method with the exact same name but without the leading underscore, which returns the protected attribute. This method does not do much, except that we can still get a book's number of pages using the familiar dot syntax:

```python
book = Book("Winds of Winter", "GRRM", 500)
book.n_pages
```

```
Getter method is called!
500
```

Even though `n_pages` is defined as a method, wrapping it with the `@property` decorator lets us use it like an attribute. It gets called whenever we access the `n_pages` attribute of a book. This is called customizing access to the internal `_n_pages` attribute. The interesting thing is that we can no longer modify it with plain assignment:

```python
book.n_pages = 400
```

```
AttributeError: can't set attribute
```
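
Keep in mind that this protection is still based on the naming convention from the first section. The sketch below repeats a trimmed-down `Book` so that it runs on its own. It shows that assignment through the property is blocked, while the underlying `_n_pages` attribute remains writable for anyone who ignores the underscore. The wording of the error message can differ between Python versions, so it is printed rather than assumed.

```python
class Book:
    def __init__(self, title, author, n_pages):
        self._n_pages = n_pages

    @property
    def n_pages(self):
        return self._n_pages


book = Book("Winds of Winter", "GRRM", 500)

try:
    book.n_pages = 400  # no setter defined, so this raises AttributeError
except AttributeError as error:
    print("Blocked:", error)

book._n_pages = 400  # nothing stops this, it is only discouraged
print(book.n_pages)  # 400
```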

Great! Let's follow the same procedure for the other book attributes because they should not be changed either:

```python
class Book:
    def __init__(self, title, author, n_pages):
        self._n_pages = n_pages
        self._title = title
        self._author = author

    @property
    def n_pages(self):
        return self._n_pages

    @property
    def title(self):
        return self._title

    @property
    def author(self):
        return self._author
```

OK, we learned how to make attributes read-only. Now, let's learn how to implement validation when their values are set.

Let's say we have a simple `Shirt` class that has price, size, color, and style attributes. Since prices of shirts change regularly, we want to make sure that the changes comply with our company's standards. In other words, we want to validate setting and modifying the prices of our shirts.

Once again, we start by defining a 'protected' attribute and creating its namesake property for accessing it. What's new is the next method, which validates that the new price is not below 0:

```python
class Shirt:
    def __init__(self, price, size, color, style):
        self._price = price

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, new_price):
        if new_price < 0:
            raise ValueError("Invalid price!")
        else:
            self._price = new_price
```

This validation fires whenever we modify an existing shirt's price. Notice how the method is wrapped with a decorator using the `@attrname.setter` syntax. This tells Python to call this method whenever we assign a new value to `price` with the assignment operator, and the method in turn updates the internal `_price` attribute.

```python
shirt1 = Shirt(25, "XL", "black", "short-sleeve")
shirt1.price
```

```
25
```

```python
shirt1.price = -45
```

```
ValueError: Invalid price!
```
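
One thing to note about the `Shirt` class is that its constructor assigns `self._price` directly, so the check only runs on later assignments and a negative initial price would be accepted. If you want the same rule at construction time, one possible approach is to assign to the property inside `__init__`. This is a minimal sketch, and the class name `ValidatedShirt` is made up for this illustration.

```python
class ValidatedShirt:  # hypothetical name, not from the original example
    def __init__(self, price, size, color, style):
        # Assigning to self.price (no underscore) calls the setter below,
        # so the initial value is validated as well
        self.price = price

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, new_price):
        if new_price < 0:
            raise ValueError("Invalid price!")
        self._price = new_price


try:
    ValidatedShirt(-45, "XL", "black", "short-sleeve")
except ValueError as error:
    print(error)  # Invalid price!
```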

Properties support one more kind of method. The `@attrname.deleter` decorator marks a method that gets called whenever someone tries to delete the property from an object:

```python
class Shirt:
    def __init__(self, price, size, color, style):
        self._price = price

    @property
    def price(self):
        return self._price

    @price.setter
    def price(self, new_price):
        if new_price < 0:
            raise ValueError("Invalid price!")
        else:
            self._price = new_price

    @price.deleter
    def price(self):
        raise AttributeError("Cannot delete the price!")
```

```python
shirt2 = Shirt(50, "L", "white", "long-sleeve")

del shirt2.price
```

```
AttributeError: Cannot delete the price!
```

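Finally, I would like to point out that the getter, setter, and deleter methods can all share the same name (e.g. `price`) without overriding one another. Each of the `@property`, `@price.setter`, and `@price.deleter` decorators returns a property object that bundles the methods defined so far, and that object is what ends up bound to the name `price`.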
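
If you are curious why sharing the name is safe, the sketch below peeks at the property objects themselves. The extra `getter_only` name is hypothetical and exists only to keep a reference to the property as it was before the setter was added. The `fget` and `fset` attributes are part of the built-in `property` type.

```python
class Shirt:
    def __init__(self, price):
        self._price = price

    @property
    def price(self):
        return self._price

    # Hypothetical extra name, kept only to inspect the getter-only property
    getter_only = price

    @price.setter
    def price(self, new_price):
        if new_price < 0:
            raise ValueError("Invalid price!")
        self._price = new_price


print(Shirt.getter_only.fset is None)              # True: no setter yet
print(Shirt.price.fset is None)                    # False: setter attached
print(Shirt.price.fget is Shirt.getter_only.fget)  # True: same getter reused
```

In other words, `@price.setter` does not replace the getter. It builds a new property that carries the existing getter along with the new setter.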