---
title: Array type conversion in Postgres
description: Exploring type conversion behaviour with peewee ORM and postgres 
date: 2-04-2024
categories:
  - debugging
  - postgres
  - sql
  - orm
  - peewee
  - psycopg2
  - python
image: 'pg-btree.png'
format:
    html:
        toc: true
        toc-location: left
        number-sections: true
---

Using an Array type for a column in Postgres provides a convenient way to store multiple values for a single row without having to create a separate table linked by a foreign key.

For example, if we have users we might want to store their data in the following format:

| id | username  | password_hash |
|----|-----------|-----------------|
| 1  | ismailmo1 |                 |

When developing an application, in Python for example, you might use an object-relational mapping (ORM) library like SQLAlchemy or Peewee to map your database rows to Python objects. This can speed up development by adding a layer of abstraction over SQL, although the value of ORMs versus raw SQL has long been debated, so it's not guaranteed that this abstraction is a good thing!

Using peewee, we can represent the user table with the following model definition:

In [None]:
from peewee import PostgresqlDatabase, Model, CharField
import bcrypt

psql_db = PostgresqlDatabase("postgres", host="localhost", port=5432, user="postgres")


class BaseModel(Model):
    """A base model that will use our Postgresql database"""

    class Meta:
        database = psql_db


class User(BaseModel):
    username = CharField()
    password_hash = CharField()

psql_db.create_tables([User])

We can then create a user with the following:

In [None]:
def hash_pwd(pwd) -> bytes:
    return bcrypt.hashpw(pwd.encode("utf-8"), bcrypt.gensalt())

user = User.create(
    username="test_user",
    password_hash=hash_pwd("password")
)

Due to a new security requirement, we need to limit the reuse of old passwords. One way we can achieve this is by storing old password hashes and checking that the new password doesn't match any of them.
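Because the hashes are salted, the check cannot be a plain `in` test on the list: the new password has to be verified against each stored hash in turn. The sketch below illustrates that loop using only the standard library, with PBKDF2 from `hashlib` swapped in for bcrypt so it runs without extra packages. The names `hash_pwd_standin`, `matches` and `is_reused` are hypothetical; with bcrypt, the per-hash comparison would be `bcrypt.checkpw`.

```python
import hashlib
import hmac
import os
from typing import List, Optional

SALT_LEN = 16  # assumption: salt is stored as the first 16 bytes of the hash


def hash_pwd_standin(pwd: str, salt: Optional[bytes] = None) -> bytes:
    """Salted PBKDF2 hash, standing in for bcrypt.hashpw in this sketch."""
    salt = os.urandom(SALT_LEN) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", pwd.encode("utf-8"), salt, 100_000)
    return salt + digest


def matches(pwd: str, stored: bytes) -> bool:
    """Re-hash pwd with the salt taken from the stored hash and compare."""
    salt = stored[:SALT_LEN]
    return hmac.compare_digest(hash_pwd_standin(pwd, salt), stored)


def is_reused(new_password: str, previous_hashes: List[bytes]) -> bool:
    """True if new_password verifies against any previously stored hash."""
    return any(matches(new_password, old) for old in previous_hashes)


previous = [hash_pwd_standin("password"), hash_pwd_standin("hunter2")]
print(is_reused("password", previous))      # True
print(is_reused("new_password", previous))  # False
```
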

We can take advantage of the `Array` type in postgres and redefine our `User` model to have an extra column: `previous_password_hashes`:

In [None]:
from playhouse.postgres_ext import ArrayField

class UserHighSecurity(BaseModel):
    username = CharField()
    password_hash = CharField()
    previous_password_hashes = ArrayField(CharField, null=True)

psql_db.create_tables([UserHighSecurity])


Now, each time a user changes their password, we can add the current password hash to the array in this field (for simplicity, assume we keep every previous hash):

In [None]:
def change_password(user: UserHighSecurity, new_password: str) -> None:
    # Archive the outgoing hash first, then overwrite it with the new one.
    user.previous_password_hashes.insert(0, user.password_hash)
    user.password_hash = hash_pwd(new_password)
    user.save()

Testing this out with a new user we can see how this column might work in practice:

In [None]:
user_high_sec = UserHighSecurity.create(username="test", password_hash = hash_pwd("pass"), previous_password_hashes=[])
change_password(user_high_sec, "new_password")

How it's stored in the database:

|id|username|password_hash|previous_password_hashes|
|--|--------|-------------|------------------------|
|1|test|$2b$12$c8MMNDgtLFlUTbxlvGl6OuDs8V8AOu2dT3/1rhjyT7D8mHyyPsAKm|{"\\x2432622431322463384d4d4e4467744c466c555462786c76476c364f754473385638414f75326454332f3172686a79543744386d4879795073414b6d", "\\x243262243132244b5832424b696b39356c616b2f574d6c4977546c4375316873326a4c54353739677648706e774a6d6f5a64446b6d784a335641666d"}|


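Here we can see that `password_hash` is stored as the familiar `$2b$12$...` bcrypt string, while each element of `previous_password_hashes` is stored as `\x` followed by a long run of hex digits, even though both are built on the same `CharField` type in peewee (and the same character type in Postgres).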
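The two formats turn out to hold the same data. As a quick check, the snippet below takes the first array element from the table (a single backslash in the database, doubled in the table only for markdown escaping), strips the `\x` prefix and decodes the remaining hex digits. The `\x` prefix followed by hex digits is the format Postgres uses when it prints `bytea` values, which suggests the array elements reached the database as bytes rather than as text.

```python
# Values copied from the table above.
text_hash = "$2b$12$c8MMNDgtLFlUTbxlvGl6OuDs8V8AOu2dT3/1rhjyT7D8mHyyPsAKm"
array_element = r"\x2432622431322463384d4d4e4467744c466c555462786c76476c364f754473385638414f75326454332f3172686a79543744386d4879795073414b6d"

# Drop the leading "\x" and turn each pair of hex digits back into one byte.
raw_bytes = bytes.fromhex(array_element[2:])
decoded = raw_bytes.decode("utf-8")

print(decoded)
print(decoded == text_hash)  # True
```
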

We can narrow this down to this part of the source code in peewee (our ORM of choice here):

```python
# from peewee.py
class _StringField(Field):
    def adapt(self, value):
        if isinstance(value, text_type):
            return value
        elif isinstance(value, bytes_type):
            return value.decode('utf-8') # <- this line of code implicitly converts our password hash bytes to string
        return text_type(value)
```
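To see the effect of this branch in isolation, the same logic can be reproduced without peewee. This is a standalone stand-in rather than the library code itself: it assumes Python 3, where `text_type` and `bytes_type` correspond to `str` and `bytes`.

```python
def adapt(value):
    """Stand-in that mirrors the branches of _StringField.adapt quoted above."""
    if isinstance(value, str):
        return value
    elif isinstance(value, bytes):
        return value.decode("utf-8")  # bytes are silently turned into text
    return str(value)


# The bcrypt hash from the table above, as bytes (what hash_pwd returns).
hashed = b"$2b$12$c8MMNDgtLFlUTbxlvGl6OuDs8V8AOu2dT3/1rhjyT7D8mHyyPsAKm"
adapted = adapt(hashed)

print(type(hashed).__name__, "->", type(adapted).__name__)  # bytes -> str
print(adapted)
```

This matches what ends up in the `password_hash` column: the bytes are decoded before the value reaches the database driver, so Postgres only ever sees a string.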

This `adapt` method is called for each column/field while a query is being generated, through the field's `db_value` method:

```python
# from peewee.py
class Field(ColumnBase):
    ...
    def db_value(self, value):
        return value if value is None else self.adapt(value)
...
class Insert(_WriteQuery):
    ...
    def _generate_insert(self, insert, ctx):
        ...
        columns_converters = [
            (column, column.db_value if isinstance(column, Field) else None)
            for column in columns]
```
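The stored values suggest that the elements of the array are not passed through `adapt` one by one, so any `bytes` in the list reach the driver unchanged. One way to keep both columns in the same format, under that assumption, is to decode every hash to `str` before it is assigned to the model. The sketch below uses two hypothetical helpers, `as_text` and `normalise_hashes`; the two sample values are only illustrative.

```python
from typing import Iterable, List, Union


def as_text(hash_value: Union[str, bytes]) -> str:
    """Return the hash as str, decoding it first if it is bytes."""
    if isinstance(hash_value, bytes):
        return hash_value.decode("utf-8")
    return hash_value


def normalise_hashes(hashes: Iterable[Union[str, bytes]]) -> List[str]:
    """Decode every element so the list contains only str values."""
    return [as_text(h) for h in hashes]


# A list mixing bytes (straight from the hashing step) with an existing str.
mixed = [
    b"$2b$12$xGPKwSxUtnhsUHlAsxTLoeR9owbHFYTB5Y/zfa.wA9lT4LqblxKDq",
    "$2b$12$2eSqzTNm4.nXJcUSgg8ry.HLGuAE/asdZW3wb1mPhXEl.kh8KErsS",
]
normalised = normalise_hashes(mixed)

print([type(h).__name__ for h in normalised])  # ['str', 'str']
```

In the example above, this would mean calling `as_text` on a hash before inserting it into `previous_password_hashes`. peewee's `ArrayField` also accepts a `convert_values` argument that may be worth checking as an alternative; it has not been tested here.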

<!-- notes
- show generated sql queries for normal case of array insert (i.e. bytes)
- show special case of one element of array as bytes with others as string 

what happens when we try add bytes into existing array with strings
insert
	into
	public."user" (username,
	password_hash,
	prev_password_hashes)
values ('test_user',
'$2b$12$xGPKwSxUtnhsUHlAsxTLoeR9owbHFYTB5Y/zfa.wA9lT4LqblxKDq',
array['\x243262243132247847504b77537855746e687355486c417378544c6f6552396f7762484659544235592f7a66612e7741396c54344c71626c784b4471'::bytea,
'$2b$12$2eSqzTNm4.nXJcUSgg8ry.HLGuAE/asdZW3wb1mPhXEl.kh8KErsS',
'$2b$12$jFSVNCloqm6O4kysK44azeDicLVfCKBCIhr23Y6N6BHoHXGPRfjua',
'$2b$12$ifaAPe69L.vcFHSnsG0eouE8iWd7s0sqwCqXFYVjMWszNDRqRMDde',
'$2b$12$Wy.XarJWzhZM58Y6NmtJ.ungymHy41qF7nT91fbhvfcZyE0ywHlXW',
'$2b$12$XpPaOIZTWGU.5ddln4OSh.BNXhDaDNamfngAKTlhqLilMB2MreaTy'])


FINAL QUERY - all bytes converted to string in python code with .decode()
insert
	into
	public."user" (username,
	password_hash,
	prev_password_hashes)
values ('test_user',
'$2b$12$xGPKwSxUtnhsUHlAsxTLoeR9owbHFYTB5Y/zfa.wA9lT4LqblxKDq',
array['$2b$12$xGPKwSxUtnhsUHlAsxTLoeR9owbHFYTB5Y/zfa.wA9lT4LqblxKDq',
'$2b$12$2eSqzTNm4.nXJcUSgg8ry.HLGuAE/asdZW3wb1mPhXEl.kh8KErsS',
'$2b$12$jFSVNCloqm6O4kysK44azeDicLVfCKBCIhr23Y6N6BHoHXGPRfjua',
'$2b$12$ifaAPe69L.vcFHSnsG0eouE8iWd7s0sqwCqXFYVjMWszNDRqRMDde',
'$2b$12$Wy.XarJWzhZM58Y6NmtJ.ungymHy41qF7nT91fbhvfcZyE0ywHlXW',
'$2b$12$XpPaOIZTWGU.5ddln4OSh.BNXhDaDNamfngAKTlhqLilMB2MreaTy'])


-->
