Skip to content
This repository has been archived by the owner on Aug 27, 2023. It is now read-only.

Exception using scan with undefined name field #44

Closed
t-unit opened this issue Jul 19, 2016 · 8 comments
Closed

Exception using scan with undefined name field #44

t-unit opened this issue Jul 19, 2016 · 8 comments

Comments

@t-unit
Copy link

t-unit commented Jul 19, 2016

There seems to be a bug where flywheel is trying to decode any string value for not defined name fields.

Conditions seem to be:

  • name field present in dynamodb table while no corresponding field defined in model
  • custom table name defined using __metadata__ and _name
  • objects are obtained from dynamodb using scan and not get (did not check query yet)

Sample Code

from flywheel import Model, Field, Engine
from flywheel.fields.types import StringType
import uuid

class Item(Model):
    __metadata__ = {
        '_name': 'items'
    }
    uuid = Field(data_type=StringType(), hash_key=True, nullable=False)
    # name = Field(data_type=StringType())  # only not having this field defined will let the script crash

# setup
engine = Engine()
engine.connect_to_region('eu-west-1')
engine.register(Item)
engine.create_schema()

# create an item
unique_id = str(uuid.uuid1())
i = Item()
i.uuid = unique_id
i.name = 'a name' # using name (which is not a field) while _name is defined in __metadata__
engine.save(i)

# scan for
engine.get(Item, [unique_id]) # no crash
engine.scan(Item).filter().all() # crash happens here

Stack Trace

Traceback (most recent call last):
  File "addition_field.py", line 27, in <module>
    engine.scan(Item).filter().all() # crash happens here
  File "/usr/local/lib/python3.5/site-packages/flywheel/query.py", line 115, in all
    exclusive_start_key=exclusive_start_key))
  File "/usr/local/lib/python3.5/site-packages/flywheel/query.py", line 301, in gen
    yield self.model.ddb_load_(self.engine, result)
  File "/usr/local/lib/python3.5/site-packages/flywheel/models.py", line 493, in ddb_load_
    obj.set_ddb_val_(key, val)
  File "/usr/local/lib/python3.5/site-packages/flywheel/models.py", line 485, in set_ddb_val_
    setattr(self, key, Field.ddb_load_overflow(val))
  File "/usr/local/lib/python3.5/site-packages/flywheel/fields/__init__.py", line 265, in ddb_load_overflow
    return json.loads(val)
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

edit: I noticed something more. Actually running the script above works fine. But when you have a closer look at the content of the table / raw data returned by dynamo3for the scan you will notice the additional ": {'name': '"a name"', 'uuid': '5ebc745a-4def-11e6-864d-20c9d07f4883'}

Manually removing them (e.g. {'name': 'a name', 'uuid': '6b87e066-4def-11e6-a7b2-20c9d07f4883'}) will trigger the exception/bug described above. I didn't dig deeper why and where these " get inserted. But there is definitely a problem relying on them when the data in the table also gets touched by other software.

@stevearc
Copy link
Owner

Thanks for reporting! I gave it a quick go and wasn't able to reproduce the crash. The weird data format you're seeing returned makes me suspicious of the lower-level libraries. Could you give me the output of a pip freeze? I'm particularly interested in the versions of dynamo3 and botocore. It wouldn't be the first time a dependency updated and broke some core functionality.

@t-unit
Copy link
Author

t-unit commented Aug 8, 2016

I did some more investigating. I found a way to clearly reproduce the error on my side.

Environment

$ python --version
Python 3.5.2
$ pip freeze
botocore==1.4.43
docutils==0.12
dynamo3==0.4.9
flywheel==0.4.9
jmespath==0.9.0
python-dateutil==2.5.3
six==1.10.0

(I used a clean virtual environment for testing)

Steps to reproduce

Step 1 run this script (name field declared)

from flywheel import Model, Field, Engine
from flywheel.fields.types import StringType
import uuid

class Item(Model):
    __metadata__ = {
        '_name': 'test'
    }
    uuid = Field(data_type=StringType(), hash_key=True, nullable=False)
    name = Field(data_type=StringType())

# setup
engine = Engine()
engine.connect_to_region('eu-west-1')
engine.register(Item)
engine.create_schema()

# create an item
unique_id = str(uuid.uuid1())
i = Item()
i.uuid = unique_id
i.name = 'a name'
engine.save(i)

Step 2 run following script (no name field declared)

from flywheel import Model, Field, Engine
from flywheel.fields.types import StringType
import uuid

class Item(Model):
    __metadata__ = {
        '_name': 'test'
    }
    uuid = Field(data_type=StringType(), hash_key=True, nullable=False)

# setup
engine = Engine()
engine.connect_to_region('eu-west-1')
engine.register(Item)

# scan
engine.scan(Item).all() # crash happens here

(same trace as in the original post)

Hope you can reproduce it this time.

@t-unit
Copy link
Author

t-unit commented Aug 9, 2016

Got a related question or even the answer why this issue exists. I looked a bit at your unit tests. In tests/test_fields.py:432:

"Extra string fields are stored as json strings" – what is reason behind this design decision?

Doesn't it exactly break with my example above? Storing some fields first without transforming it to json and later try reading it as an overflow field that expects the stored value to valid json?

@stevearc
Copy link
Owner

I mention this a bit in #31, but basically it was just a horrible decision that should never have been made. I've even removed it in the 0.5 branch that I've been sitting on for more than a year. I wanted to get some more features into it before releasing it (like migrating to the newer dynamo query API), but I sadly don't have much time to work on flywheel anymore.

Since you're hitting this now and I've seen a couple other people be bit by it, I'm going to try to find a couple hours this week to tidy up what I have on 0.5 and release it. I'm sorry you had to waste time on this and hopefully we'll bury it soon.

@t-unit
Copy link
Author

t-unit commented Aug 22, 2016

Thanks for clarifying it. Looking forward to 0.5

@t-unit t-unit closed this as completed Aug 22, 2016
@stevearc
Copy link
Owner

0.5.0 is out!

@magicronn
Copy link

Do you have an Amazon Wishlist or favorite charity?
This excellent package has been a real time saver for me.
I also may have a patch for getattr but want to see if it is irrelevant in
0.5.0 first.

Sincerely,
Ronn B
Generic Startup Guy

On Mon, Aug 22, 2016 at 10:31 PM, Steven Arcangeli <notifications@github.com

wrote:

0.5.0 https://pypi.python.org/pypi/flywheel/0.5.0 is out!


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#44 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABn__uHbTWV545VwGO9HwG84znsuexoAks5qioXNgaJpZM4JQF-H
.

@stevearc
Copy link
Owner

Glad to hear that it's been useful for you! Patches and bug reports are always welcome.

If you'd like to donate, I'd highly recommend Give Directly. They've been my favorite charity for a couple years now.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants