NEOS-433: added escape to title ids to fix acorn parsing error (#975)
evisdrenova authored Dec 26, 2023
1 parent 2b5c765 commit 03be4ba
Showing 7 changed files with 163 additions and 49 deletions.
42 changes: 42 additions & 0 deletions docs/docs/guides/creating-a-data-gen-job.mdx
@@ -0,0 +1,42 @@
---
title: Creating a Data Generation Job
id: creating-a-data-gen-job
hide_title: false
slug: /guides/creating-a-data-gen-job
---

import { DocsImage } from '@site/src/CustomComponents/DocsImage.tsx';

## Introduction

In this guide we will walk through how to create a [data generation job](/core-concepts#jobs). Data generation jobs are used to populate a database or datastore with freshly created synthetic data. Some use cases for data generation jobs are:

1. Creating training data for machine learning use cases such as training a model
2. Augmenting your existing database with more data for performance and scalability testing
3. Generating data for demo environments

## Creating a Data Generation Job

In order to create a data generation job:

1. On the **Jobs** page, click on the **+ New Job** button.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/jobs-page.png" />

2. Select the **Data Generation** job type.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/job-type.png" />

3. Give your job a **Name**. If you want your job to run on a schedule, click on the schedule switch to expose an input where you can provide a cron string. Your job will run on this schedule. Lastly, activate the **Initiate Job Run** switch if you want to trigger a single job run as soon as the job is created. Click **Next** once you're ready.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-data-gen-job-define.png" />

4. Select your destination connection(s). You may also configure your destination with the provided configuration options.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-data-gen-job-connect.png" />

5. Next is the Schema page. Here you can select how you want to transform your tables and columns with [**Transformers**](/core-concepts#transformers). Select your schema and the table you want to transform and then the number of rows you want to generate. There are a number of [transformers](/transformers/system) that Neosync ships with out of the box or you can create your own custom transformer. Once you're done, you can click **Next**.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-data-gen-job-schema.png" />

6. Congrats! You have successfully created a job. From here, you will be taken to the Job Details page, where you can pause, resume, run, or update the job you created.
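
The schedule input in step 3 takes a cron string. As a rough illustration only, and assuming the common five-field cron layout (minute, hour, day of month, month, day of week), the Python sketch below splits a cron string into named fields so you can sanity-check it before pasting it into the form. The helper name `split_cron` is hypothetical and is not part of Neosync.

```python
# Hypothetical helper for reading a five-field cron string; not a Neosync API.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expr: str) -> dict[str, str]:
    """Map each whitespace-separated cron field to its name."""
    parts = expr.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expr!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# "0 2 * * *" is the conventional way to say "02:00 every day".
nightly = split_cron("0 2 * * *")
for name, value in nightly.items():
    print(f"{name}: {value}")
```

This only checks the field count. It does not validate ranges or special syntax, so treat it as a reading aid rather than a validator.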
46 changes: 46 additions & 0 deletions docs/docs/guides/creating-a-sync-job.mdx
@@ -0,0 +1,46 @@
---
title: Creating a Sync Job
id: creating-a-sync-job
hide_title: false
slug: /guides/creating-a-sync-job
---

import { DocsImage } from '@site/src/CustomComponents/DocsImage.tsx';

## Introduction

In this guide we will walk through how to create a [sync job](/core-concepts#jobs). Sync jobs are used to sync data between a source and one or many destinations. Some use cases for sync jobs are:

1. Syncing and anonymizing prod data to lower-level environments
2. Syncing data between two lower-level environments with no transformations
3. Syncing and anonymizing data to be used for analytical and machine learning use cases such as training a model

## Creating a Sync Job

In order to create a sync job:

1. On the **Jobs** page, click on the **+ New Job** button.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/jobs-page.png" />

2. Select the **Data Synchronization** job type.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/job-type.png" />

3. Give your job a **Name**. If you want your job to run on a schedule, click on the schedule switch to expose an input where you can provide a cron string. Your job will run on this schedule. Lastly, activate the **Initiate Job Run** switch if you want to trigger a single job run as soon as the job is created. Click **Next** once you're ready.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-sync-job-definition.png" />

4. Select your source and destination connection(s). You may select only one source, but you can select multiple destinations. You may also configure your source and destination with the provided configuration options.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-sync-job-connections.png" />

5. Next is the Schema page. Here you can select how you want to transform your tables and columns with [**Transformers**](/core-concepts#transformers). There are a number of [transformers](/transformers/system) that Neosync ships with out of the box or you can create your own custom transformer.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-job-sync-schema.png" />

6. Lastly, you can configure a [subset](/core-concepts#subset). A subset is a way to filter the data that is being synced to the destination(s). A common use case is to filter the data to reduce its size or dimensionality. You can subset the data with `WHERE` filters by typing the filter into the filter box. As you type, you'll see your `WHERE` filter being constructed, and you can click the **Validate** button to check that the subset query will execute successfully against the schema. Click **Next** once you're done.

<DocsImage href="https://assets.nucleuscloud.com/neosync/docs/new-sync-job-subset.png" />

7. Congrats! You successfully created a job. From here, you will be taken to the Job Details page where you can pause, resume, run or update the job you created.
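
To make the subset step more concrete, here is a minimal Python sketch of what a `WHERE` filter does and what "validating against the schema" can mean. It uses an in-memory SQLite database rather than Neosync itself, and the `users` table, its columns, and the `validate_filter` and `apply_filter` helpers are all hypothetical. It is not a description of how Neosync implements validation.

```python
import sqlite3


def validate_filter(conn: sqlite3.Connection, table: str, where: str) -> bool:
    """Return True if `SELECT * FROM <table> WHERE <where>` compiles against the schema."""
    try:
        # EXPLAIN compiles the statement without reading table rows.
        # String interpolation is acceptable here only because this is a local demo.
        conn.execute(f"EXPLAIN SELECT * FROM {table} WHERE {where}")
        return True
    except sqlite3.OperationalError:
        return False


def apply_filter(conn: sqlite3.Connection, table: str, where: str) -> list:
    """Return only the rows that match the filter, i.e. the subset."""
    return conn.execute(f"SELECT * FROM {table} WHERE {where}").fetchall()


# Hypothetical source table with three rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (country, age) VALUES (?, ?)",
    [("US", 34), ("DE", 27), ("US", 19)],
)

where = "country = 'US' AND age >= 21"
if validate_filter(conn, "users", where):
    subset = apply_filter(conn, "users", where)
    print(subset)

# A filter that references a column missing from the schema fails validation.
print(validate_filter(conn, "users", "signup_date > '2023-01-01'"))
```

The filter text is everything that would follow the `WHERE` keyword. A filter that names a column the table does not have is rejected before any rows are read.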
@@ -1,14 +1,16 @@
---
-title: Github Actions
-id: github-actions
+title: Using Neosync in CI
+id: using-neosync-in-ci
hide_title: false
-slug: /github-actions
+slug: /guides/using-neosync-in-ci
---

import { DocsImage } from '@site/src/CustomComponents/DocsImage.tsx';

## Introduction

Continuous Integration is a primary use case for Neosync. Integration or unit tests that run in CI often need good data.
-It's easy enough to spin up a Postgres or other kind of database using Github Actions, but the problem becomes hydrating that database with solid data that can be used for testinng purposes.
+It's easy enough to spin up a Postgres or other kind of database using Github Actions, but the problem becomes hydrating that database with solid data that can be used for testing purposes.

For this reason, we built the [neosync sync](../cli/sync.mdx) command to enable synchronizing a connection configured in Neosync to a locally hosted database, or any other database that may not otherwise be available over the internet easily.
