Skip to content

Conversation

@fabiansinz
Copy link
Contributor

Change of RelatinalOperand.__iter__:

  • RelatinalOperand.__iter__ now returns an iterator that yields the single tuples of the table as tuples
  • the old behaviour of returning the primary keys can be achieved by mytable.project()
  • RelatinalOperand.fetch now uses __iter__

Settings:

  • I thought it makes sense that settings.Config is a singleton since there should be one set of global settings.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.51%) to 63.65% when pulling 25463b5 on fabiansinz:master into d81e845 on datajoint:master.

@dimitri-yatsenko
Copy link
Member

The regular fetch returns numpy records. Why should the iterator yield tuples instead. Let's make it return numpy arrays with typed attributes.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.54%) to 63.69% when pulling 03b23f0 on fabiansinz:master into d81e845 on datajoint:master.

@fabiansinz
Copy link
Contributor Author

I would have preferred it otherwise, but I don't make the main design decisions. Therefore, I changed it.

Here is why I would have preferred it otherwise:

I don't like record arrays. While they are awkward to construct and deal with, I think they provide very little benefit. I thought it would have been the cleaner solution for __iter__ to return tuples. In particular, since the creating of a single np.void is done via

np.array([tuple(unpack(field) if up else field for up, field in zip(do_unpack, q))],
                           dtype=self.heading.as_dtype)[0]

I couldn't find a way to just create a single line of the above array. This means that the iteration

for animal_id, v_trace, comment in Mice():
   ...

now constructs a record array from a tuple, takes its first element, returns an np.void, only to be converted into a tuple again. This does not seem efficient/clean to me. Maybe it has advantages for restrictions or similar operations. But then, I don't see when you'd want to restrict with a full row of a table.

Does fetch get any easier? No, since constructing record arrays from existing np.voids is not straightforward. What used to be

np.array(list(self.__iter__(offset, limit, order_by, descending)), dtype=self.heading.as_dtype)

now became

np.atleast_1d(rfn.stack_arrays(list(self.__iter__(offset, limit, order_by, descending)), usemask=False))

What do we gain? We can use

for row in Mice():
   x = row.animal_id # or row['animal_id']

To me that looks just like emulating Matlab while Python can offer cleaner constructs for that. If we want the 'row.animal_id' behaviour, we could have just used namedtuples.

@eywalker
Copy link
Contributor

Looking at the current implementation, I agree with @fabiansinz that returning record array seems to be making the code rather dirty and convoluted, yet it is unclear what we are gaining from this...

@dimitri-yatsenko
Copy link
Member

The result of the iteration needs to preserve the attribute names. So at
the very least it's a dict.

On Sun, May 10, 2015 at 11:37 AM, Edgar Y. Walker notifications@github.com
wrote:

Looking at the current implementation, I agree with @fabiansinz
https://github.com/fabiansinz that returning record array seems to be
making the code rather dirty and convoluted, yet it is unclear what we are
gaining from this...


Reply to this email directly or view it on GitHub
#62 (comment)
.

@fabiansinz
Copy link
Contributor Author

What do we need that for? For restrictions? dicts potentially shuffle the order. OrderedDict would be better. However, both are awkward to stack together into a record array in fetch. I think a namedtuple would be better and efficient.

mice_tuple = namedtuple('mice_tuple', ['animal_id', 'data','comment'], verbose=False)
t = mice_tuple(2, np.random.randn(2,3), 'no comment')

You can use it like

In [18]: t
Out[18]: 
mice_tuple(animal_id=2, data=array([[ 1.32082638, -1.07125766, -0.82236867],
       [-0.27063339,  0.15236165,  0.83204986]]), comment='no comment')

In [19]: t.animal_id
Out[19]: 2

In [20]: t.data
Out[20]: 
array([[ 1.32082638, -1.07125766, -0.82236867],
       [-0.27063339,  0.15236165,  0.83204986]])

In [21]: ai, d, c = t

@dimitri-yatsenko
Copy link
Member

exactly

On Sun, May 10, 2015 at 1:28 PM, Fabian Sinz notifications@github.com
wrote:

What do we need that for? For restrictions?


Reply to this email directly or view it on GitHub
#62 (comment)
.

@fabiansinz fabiansinz closed this May 12, 2015
@eywalker eywalker assigned eywalker and unassigned eywalker May 12, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants