Reset increment #1263

rgoubet · 2022-09-15T07:47:13Z

Feature request

Unless I missed it, there doesn't seem to be a way to reset increments: if you generate data several times with the same schema, increments will pick up from the previous creation:

from mimesis import Field, Schema

_ = Field()
schema = Schema(schema=lambda: {
    "id": _('increment'),
    'name': _('full_name')})

for i in range(0,5):
    data = schema.create(5)
    print(data[0]['id'])

This returns:

Thesis

There should be an option to reset the increment each time data is generated.

Reasoning

When creating large amounts of data to export several times, you don't necessarily want increments to become huge.


lk-geimfari · 2022-09-15T10:38:06Z

Hi! Actually, there is an accumulator argument for such cases: https://mimesis.name/en/master/api.html#mimesis.Numeric.increment

Here is a usage example:

>>> numeric.increment()
1
>>> numeric.increment(accumulator="a")
1
>>> numeric.increment()
2
>>> numeric.increment(accumulator="a")
2
>>> numeric.increment(accumulator="b")
1
>>> numeric.increment(accumulator="a")
3

lk-geimfari · 2022-09-15T10:44:32Z

In your case, you are using schemas the wrong way.

Instead of doing this:

for i in range(0,5):
    data = schema.create(5)
    print(data[0]['id'])

Do this:

for i in schema.create(5):
    print(i['id'])

rgoubet · 2022-09-15T12:07:40Z

In your case, you are using schemas the wrong way.

In my code example, I'm trying to create 5 fulfilled schemas (that I could then export 5 times) based on the same logical schema. Here, I cannot use a new accumulator every time unless I instantiate a new Schema object every time.

lk-geimfari · 2022-09-24T10:54:02Z

@rgoubet Sorry, I don't get the idea. Could you please illustrate it with an example?

rgoubet · 2022-09-26T06:58:39Z

My use case is that I want to create multiple large random data sets in Excel files (generated with openpyxl) for stress-test purposes. So, let's say I want to create 5 files with 1 million rows each (I use 5 columns here for readability, while in practice I have 30):

import os

from mimesis import Field, Schema
from openpyxl import Workbook

_ = Field()

schema = Schema(schema=lambda: {
    "id": _('increment'),
    "timestamp": _('datetime'),
    'version': _('version'),
    'e-mail': _('person.email', domains=['argenx.com']),
    'token': _('token_hex'),
})

Now, I'll run a loop for each file, and use the iterator to preserve memory:

for i in range(5):
    wb = Workbook(write_only=True)
    ws = wb.create_sheet()
    for ix, v in enumerate(schema.iterator(1_000_000)):
        if ix == 0:
            ws.append(list(v.keys()))  # write headers once, before the first row
        ws.append(list(v.values()))  # write data (including the first row)
    # `path` is the output directory; it is not defined in this snippet
    xl_file = os.path.join(path, f'data{i:03d}.xlsx')
    wb.save(xl_file)
    wb.close()

This all works, except that the id column keeps incrementing across files instead of restarting from 1 in each one. In my case, that could have been a problem, since the value can grow larger than the target data type allows (it turned out OK in the end).

As I said, maybe I missed something, but it would be nice to have a reset option (e.g. in the create and iterator methods) for the increments. Not critical at all, though.
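Until such an option exists, one possible workaround is to ignore the generated value and renumber the rows per file while iterating. The sketch below uses only the standard library. `renumber` is a hypothetical helper, not part of mimesis, and `fake_rows` merely stands in for `schema.iterator(...)` with a counter that keeps growing across calls.

```python
from itertools import count
from typing import Any, Dict, Iterable, Iterator

# Stands in for an increment that is never reset between batches.
_shared_counter = count(1)


def fake_rows(n: int) -> Iterator[Dict[str, Any]]:
    """Stand-in for schema.iterator(n): ids continue across calls."""
    for _ in range(n):
        yield {"id": next(_shared_counter), "name": "x"}


def renumber(
    rows: Iterable[Dict[str, Any]], key: str = "id", start: int = 1
) -> Iterator[Dict[str, Any]]:
    """Yield a copy of each row with `key` replaced by a per-call counter."""
    for n, row in enumerate(rows, start=start):
        # Overwriting an existing key keeps its position, so column order is unchanged.
        yield {**row, key: n}


first_file = [row["id"] for row in renumber(fake_rows(3))]
second_file = [row["id"] for row in renumber(fake_rows(3))]
raw_third = [row["id"] for row in fake_rows(3)]  # without renumbering
```

In the export loop above, this would mean iterating over `renumber(schema.iterator(1_000_000))` instead of `schema.iterator(1_000_000)`. It stays lazy, so memory use should remain flat.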

stale · 2023-06-18T16:10:57Z

This issue has been automatically marked as stale because it has not had activity. It will be closed if no further activity occurs. Thank you for your contributions.

lk-geimfari self-assigned this Sep 15, 2022

lk-geimfari added the question The question label Sep 15, 2022

stale bot added the stale label Jun 18, 2023

lk-geimfari closed this as completed Sep 3, 2023
