docs: Revises the README (#295)
adlersantos committed Feb 11, 2022
1 parent 6fe5f71 commit b71b113
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -16,7 +16,7 @@ Cloud-native, data pipeline architecture for onboarding public datasets to [Data

# Environment Setup

-We use Pipenv to make environment setup more deterministic and uniform across different machines. If you haven't done so, install Pipenv using the instructions found [here](https://pipenv-fork.readthedocs.io/en/latest/install.html#installing-pipenv).
+We use Pipenv to make environment setup more deterministic and uniform across different machines. If you haven't done so, install Pipenv using these [instructions](https://pipenv-fork.readthedocs.io/en/latest/install.html#installing-pipenv).

With Pipenv installed, run the following command to install the dependencies:
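The dependency-install command itself is collapsed in this diff view. As a hedged sketch of a typical Pipenv setup for a repo like this one (the `--dev` flag is an assumption about the project's Pipfile, not something shown in the diff):

```shell
# Sketch only: installs packages listed in the repo's Pipfile/Pipfile.lock.
# The --dev flag (include development dependencies) is an assumption.
pipenv install --dev

# Commands are then run inside the managed virtualenv, e.g.:
pipenv run python --version
```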

@@ -72,7 +72,7 @@ Every YAML file supports a `resources` block. To use this, identify what Google

- BigQuery datasets and tables to store final, customer-facing data
- GCS bucket to store downstream data, such as those linked to in the [Datasets Marketplace](https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset).
-- Sometimes, for very large datasets that requires processing to be parallelized, you might need to provision a [Dataflow](https://cloud.google.com/dataflow/docs) (i.e. Apache Beam) job
+- Sometimes, for very large datasets that requires processing to be parallelized, you might need to provision a [Dataflow](https://cloud.google.com/dataflow/docs) (Apache Beam) job


## 3. Generate Terraform files and actuate GCP resources
@@ -101,7 +101,7 @@ The `--tf-state-bucket` and `--tf-state-prefix` parameters can be optionally use

In addition, the command above creates a "dot env" directory in the project root. The directory name is the value you set for `--env`. If it's not set, the value defaults to `dev` which generates the `.dev` folder.

-Consider a dot directory as your own sandbox, specific to your machine, that's mainly used for prototyping. As will be seen later, this directory is where you will set the variables specific to your environment: such as actual GCS bucket names, GCR repository URLs, and secrets (we recommend using [Secret Manager](https://cloud.google.com/composer/docs/secret-manager) for this). The files and variables created or copied in the dot directories are isolated from the main repo, i.e. all dot directories are gitignored.
+We strongly recommend using a dot directory as your own sandbox, specific to your machine, that's mainly used for prototyping. This directory is where you will set the variables specific to your environment: such as actual GCS bucket names, GCR repository URLs, and secrets (we recommend using [Secret Manager](https://cloud.google.com/composer/docs/secret-manager) for this). The files and variables created or copied in the dot directories are isolated from the main repo, meaning that all dot directories are gitignored.
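The generation step these paragraphs describe might look like the following hedged sketch. The script path and dataset name are illustrative assumptions; `--env`, `--tf-state-bucket`, and `--tf-state-prefix` are the flags the text itself mentions:

```shell
# Sketch only — the script path and names below are assumptions for illustration.
# Generates the Terraform files and creates a `.dev` dot directory in the
# project root (because --env is set to "dev").
pipenv run python scripts/generate_terraform.py \
  --dataset example_dataset \
  --env dev \
  --tf-state-bucket my-terraform-state-bucket \
  --tf-state-prefix example_dataset
```

Omitting `--env` would default the dot directory to `.dev`, per the paragraph above; the two `--tf-state-*` flags are optional and control where Terraform remote state is stored.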

As a concrete example, the unit tests use a temporary `.test` directory as their environment.

