Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add seed people and companies data for demo environment (#2207) #2307

Merged
merged 15 commits into from
Dec 2, 2023

Conversation

khakimov
Copy link
Contributor

@khakimov khakimov commented Nov 2, 2023

This pull request adds seed data (600 companies + 1548 people) for a demo environment. The data is going to be used to populate the database for demonstration purposes.

@FelixMalfait
Copy link
Member

Great! Thank you
The company data is perfect.
As for the people, I'm a little worried about the data privacy impact of including real people, I don't think we should.
Should we instead generate those names/emails once with a faker library?
Bonus (not easy): scrape something like https://generated.photos/faces and add base64 images for each contact :)

@khakimov
Copy link
Contributor Author

khakimov commented Nov 3, 2023

Great! Thank you The company data is perfect. As for the people, I'm a little worried about the data privacy impact of including real people, I don't think we should. Should we instead generate those names/emails once with a faker library? Bonus (not easy): scrape something like https://generated.photos/faces and add base64 images for each contact :)

def generate_person(company_name):
    gender = random.choice(['male', 'female'])
    if gender == 'male':
        firstName = fake.first_name_male()
    else:
        firstName = fake.first_name_female()
    lastName = fake.last_name()
    random_number = secrets.token_hex(5)  # generates a 10 character long hexadecimal string
    url = get_image_url(gender)
    base_image = download_image(url)
    print(f"Saving {firstName}, {gender}...")

    person = {
        "company": company_name,
        "firstName": firstName,
        "lastName": lastName,
        "city": fake.city(),
        "email": f"{firstName.lower()}.{lastName.lower()}@example.com",
        "avatarUrl": f"data:image/jpeg;base64,{base_image}",
        "linkedinUrl": f"/in/{firstName.lower()}-{lastName.lower()}-{random_number}",
        "jobTitle": fake.job()
    }
    return person

agree about privacy, here is updated list of people_demo.json with avatars in base64, format as showed in person above. For each company generated 2 people (=1200 total)

@FelixMalfait
Copy link
Member

Great!
Last round, could you:

  • remove "company" field from people.json (not needed as the relationship is created after)
  • rename people_demo to people-demo and companies_demo to companies-demo
  • update createDefaultPeople so that the each group of 2 people are matched with respective company, so I think it means changing companyId: companies[i] to companyId: companies[Math.floor(i / 2)]

I'll create an issue to add pagination because the app is almost crashing!

@khakimov
Copy link
Contributor Author

khakimov commented Nov 3, 2023

Let me know if that looks better and if any changes needed.

Next I will look into how to create proper pipeline-demo.json based on this data and implement reset endpoint from #2207

@charlesBochet
Copy link
Member

Taking a look this morning!

@charlesBochet
Copy link
Member

charlesBochet commented Nov 9, 2023

@khakimov Thank you, we are holding a bit on this one as we are migrating companies and people models to the new schema. We will merge it once the new models are ready

@charlesBochet
Copy link
Member

@khakimov, we have refactored the way we seed a workspace using the new schema.
Could you introduce a new command (similar to DataSeedTenantCommand) to seed a DemoTenant.

You'll need to your seeds into typeorm-seeds/tenant/demo
We are still thinking about how we should re-organize folders but we are sure about the seed format. Would be great to split the demo seeds into files smaller files to ease readibility

@khakimov
Copy link
Contributor Author

khakimov commented Nov 20, 2023

started working on seeding data with new schema (pushed some code above already), @charlesBochet let me know if organization of files is suitable. * I will adjust people-demo.json for new format as well.

I will use DataSeedWorkspaceCommand to make DemoTenant and put together the rest of demo seeding process.

@charlesBochet
Copy link
Member

@Weiko @magrinj could you take a look at these demo seeds? Would be awesome to have them to demo Twenty performances

@charlesBochet
Copy link
Member

Discussed with you on Discord @khakimov, I'm available if you need a hand to complete this one!

@charlesBochet charlesBochet self-assigned this Nov 28, 2023
// TODO: get workspaceId from .env
workspaceId = '20202020-1c25-4d02-bf25-6aeccf7e1337';
dataSourceId = '20202020-7f63-47a9-b1b3-6c7290ca1337';
workspaceSchemaName = 'workspace_1wgvd1injqtife6y4rvfb1337';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that you add 1337 at the end of the name of Ids. Why not! However for workspaceSchemaName, it might not work because in some places it's computed based on workspaceId. Might lead to bugs :)

await dataSource.initialize();

await seedCoreSchema(dataSource, this.workspaceId);
await seedMetadataSchema(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this won't actually work because these metadata seeds have hardcoded ids.
Could you leverage:

    const createdObjectMetadata =
      await this.createStandardObjectsAndFieldsMetadata(
        dataSourceMetadata.id,
        workspaceId,
      );

instead see from workspace-manager.service.ts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about it, what we want to do is exactly what we have in the init method from workspace-manager.service.ts.

Change in strategy:

  1. Keep what you have to create the core entities
  2. revert the changes to the typeorm-seeds/metadata
  3. create a initDemo in workspace-manager.service.ts that is doing exactly what we want but instead of "prefillWorkspaceWithStandardObjects" we want "prefillWorkspaceWithDemoObjects"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work like a charm, sorry I did not think about it earlier

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your quick review, I will adjust accordingly!

- delete workspace
- initDemo() with prefillWorkspaceWithDemoObjects()

* added companies-demo.ts with data
* added people-demo.ts with data
Copy link
Member

@charlesBochet charlesBochet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is gold!
So, next step:

  • revisit table performance, we really need to put back in place Implement table record virtualizer back #2728 and make sure we don't have re-renders on our table component. I have removed many to make it usable but we will need to another pass.
  • @lucasbordeau apollo onCompleted callback is not triggered on fetchMore, I have made a patch in the code base in useFindManyRecords but I think we can do better to remove duplicated code

Then, I think we should move dev seeds to the same place as prefills and demo seeds

import { EnvironmentService } from 'src/integrations/environment/environment.service';
import { WorkspaceManagerService } from 'src/workspace/workspace-manager/workspace-manager.service';

// TODO: implement dry-run
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for dry-run for seed command

@@ -142,6 +142,23 @@ export const useFindManyRecords = <
...fetchMoreResult?.[objectMetadataItem.namePlural]?.edges,
]);
}
onCompleted?.({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charlesBochet charlesBochet merged commit fd9467c into twentyhq:main Dec 2, 2023
3 of 5 checks passed
@charlesBochet charlesBochet mentioned this pull request Dec 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants