151 changes: 82 additions & 69 deletions scripts/sync-ga-to-rc/README.md
@@ -6,40 +6,76 @@ for versioned docs by comparing updates since a provided cutoff in the current

The default cutoff date is the last run date for the provided product slug, if
it exists. Otherwise, the script defaults to the creation date of the RC release
branch. The script standardizes all timestamps to ISO for simplicity but takes
the optional override date as a local time.
branch, which it calculates as the date of the first commit associated with the
branch that does not exist in main. The script standardizes all timestamps to
UTC for simplicity but takes the optional override date as a local time.
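
One way to approximate the branch creation date described above is to ask git for the first commit that is reachable from the RC branch but not from `main`. The following is a minimal sketch under that assumption, not the script's actual implementation. The function name `branch_creation_date` is hypothetical, and the sketch reads the committer date.

```shell
# Hypothetical helper: print, in UTC, the date of the first commit that is
# reachable from <branch> but not from <base>.
branch_creation_date() {
  local base="$1" branch="$2"

  # TZ=UTC0 with format-local renders the committer date in UTC.
  # --reverse flips the log order, so the first line is the earliest commit.
  TZ=UTC0 git log --reverse \
    --date=format-local:'%Y-%m-%d %H:%M:%S' \
    --format='%cd' \
    "${base}..${branch}" | head -n 1
}
```

For example, `branch_creation_date main vault/1.21.x` would print a single `YYYY-MM-DD HH:MM:SS` timestamp, or nothing if the branch has no commits of its own.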


## Assumptions

- Your RC release branch uses the following naming convention: `<product_slug>/<rc_version> <docTag>`.
- Your RC folder uses the following naming convention: `<product_slug>/<rc_folder> <docTag>`.
- You have the GitHub CLI (`gh`) installed. The CLI is required if you want the
script to create a PR on your behalf. ==> DISABLED (THE PROCESS IS STILL BUGGY)

## Usage

```text
node sync-ga-to-rc.mjs -slug <product> -ga <ga_folder> -rc <rc_folder> [<optional_flags>]
```

## Flags

Flag | Type | Default | Description
--------- | -------- | ------------ | -----------
`-slug` | `string` | No, required | Product slug used for the root content folder
`-ga` | `string` | No, required | Version of the current docset
`-rc` | `string` | No, required | Version of the unreleased docset
`-tag` | `string` | "" | String used to tag non-GA docsets (e.g., "(rc)")
`-branch` | `string` | `main` | Name of the GA branch
`-date` | `string` | null | Local override date in "YYYY-MM-DD HH:MM:SS" format for the commit date cutoff
`-update` | `bool` | false | Indicates whether to apply any safe changes locally
`-pr` | `bool` | false | Indicates whether to apply any safe changes locally and generate a PR if possible
`-merged` | `bool` | false | Indicates that RC docs are merged to `-branch`
`-help` | `bool` | false | Print usage help text and exit
Flag | Type | Default | Description
----------- | -------- | ---------------------------- | -----------
`-slug` | `string` | None, required | Product slug used for the root content folder
`-ga` | `string` | None, required | GA folder; typically the GA version with `.x`
`-rc` | `string` | None, required | Unreleased docset folder; typically the RC version with `.x`
`-tag` | `string` | "" | String used to tag non-GA docsets (e.g., "(rc)")
`-branch` | `string` | `<product_slug>/<rc_folder>` | Name of the RC branch
`-gaBranch` | `string` | `main` | Name of the GA branch
`-date` | `string` | null | Local override date in "YYYY-MM-DD HH:MM:SS" format for the commit date cutoff
`-update` | `bool` | false | Indicates whether to apply any safe changes locally
`-pr` | `bool` | false | Indicates whether to apply any safe changes locally and generate a PR if possible
`-merged` | `bool` | false | Indicates that RC docs are merged to `-gaBranch`
`-help` | `bool` | false | Print usage help text and exit


## Adding exceptions

## Usage
If you have files you **know** the script should always ignore, you can add the
relative path from your product root to the exclusion file `data/exclude.json`
using your product slug.

```text
node sync-ga-to-rc.mjs -slug <product> -ga <ga_version> -rc <rc_version> [-tag <folder_tag>] [-branch <ga_branch>] [-date <override_date>] [-update] [-pr]
Expected schema:

```json
[
  {
    "<product_slug>": [
      "<relative_path_1>",
      "<relative_path_2>",
      ...
      "<relative_path_N>"
    ]
  }
]
```

For example:

```json
[
  {
    "vault": [
      "/content/docs/updates/important-changes.mdx",
      "/content/docs/updates/release-notes.mdx",
      "/content/docs/updates/change-tracker.mdx"
    ]
  }
]
```


## Examples

### Basic call
@@ -48,7 +84,13 @@ node sync-ga-to-rc.mjs -slug <product> -ga <ga_version> -rc <rc_version> [-tag <
$ node sync-ga-to-rc.mjs -slug vault -ga 1.20.x -rc 1.21.x -tag rc
```

### Provide an override date
### Provide an explicit release branch name

```shell-session
$ node sync-ga-to-rc.mjs -slug boundary -ga 0.20.x -rc 0.21.x -branch boundary/0.21.0
```

### Provide an override date and explicit tag string

Use `-tag` to provide a specific doc tag and set a custom override date with
`-date`:
@@ -76,52 +118,16 @@ $ node sync-ga-to-rc.mjs \

### Sync two published versions

Use `-merged` and set `-tag` to "none" so the script compares folders for past
versions in `main`:
Use `-merged` so the script compares folders in `main`:

```shell-session
$ node sync-ga-to-rc.mjs \
-slug vault \
-ga 1.19.x \
-rc 1.20.x \
-merged \
-tag none
```

## Adding exceptions

If you have files you **know** the script should always ignore, you can add the
relative path from your product root to the exclusion file `data/exclude.json`
using your product slug.

Expected schema:

```json
[
  {
    "<product_slug>": [
      "<relative_path_1>",
      "<relative_path_2>",
      ...
      "<relative_path_N>"
    ]
  }
]
-merged
```

For example:

```json
[
  {
    "vault": [
      "/content/docs/updates/important-changes.mdx",
      "/content/docs/updates/release-notes.mdx",
      "/content/docs/updates/change-tracker.mdx"
    ]
  }
]
```

## General workflow

@@ -133,27 +139,33 @@ Next, the script builds the following file sets:
- exclusions - a list of files the script should ignore during the sync
- GAΔ - files in the GA (current) docset with a last commit date later
than the provided cutoff date.
than the provided cutoff date.
- RCΔ - files in the RC (unreleased) docset with a last commit date
later than the provided cutoff date.
- GA-only - files in the GA (current) docset that do not exist in the RC
docset.
- GAd - files deleted from the GA (current) docset that still exist in
the RC (unreleased) docset

The script determines what to do with the files based on the following rubric
where GAu and RCu are the set of files unchanged since the cutoff in the GA and
RC docsets:

Set definition | Implication | Action
-------------------- | ------------------ | -------------------------
file ∈ { RCu ∧ GAu } | file unchanged | ignore
file ∈ { RCu ∧ GAΔ } | updated in GA only | safe to update in RC
file ∈ { RCΔ ∧ GAu } | updated in RC only | ignore
file ∈ { RCΔ ∧ GAΔ } | updated in both | possible conflict; needs manual review
file ∈ { RC ∧ !GA } | new file for RC | ignore
file ∈ { !RC ∧ GA } | new file for GA | safe to update in RC

If `-update` is `true`, the script slams files in the RC folder with files from
the GA folder with any file deemed "safe", prints a note to review the
information in the conflict file, and updates the last run date.
Set definition | Implication | Action
-------------------- | -------------------- | -------------------------
file ∈ { RCu ∧ GAu } | file unchanged | ignore
file ∈ { RCu ∧ GAΔ } | updated in GA only | safe to update in RC
file ∈ { RCΔ ∧ GAu } | updated in RC only | ignore
file ∈ { RCΔ ∧ GAΔ } | updated in both | possible conflict; needs manual review
file ∈ { RC ∧ !GA } | new file for RC | ignore
file ∈ { !RC ∧ GA } | new file for GA | safe to update in RC
file ∈ { RC ∧ GAd } | file deleted from GA | safe to delete in RC
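
The rubric above can be read as a small decision function. The sketch below is illustrative only: the function name `classify_file`, the 0/1 flag encoding, and the order in which the rules are checked (deletes first) are assumptions, not details taken from the script.

```shell
# Hypothetical helper: map a file's state flags to the rubric's action.
# Each argument is "1" (true) or "0" (false):
#   $1 in_rc       file exists in the RC docset
#   $2 in_ga       file exists in the GA docset
#   $3 rc_changed  file changed in RC since the cutoff (RCΔ)
#   $4 ga_changed  file changed in GA since the cutoff (GAΔ)
#   $5 ga_deleted  file deleted from GA since the cutoff (GAd)
classify_file() {
  local in_rc="$1" in_ga="$2" rc_changed="$3" ga_changed="$4" ga_deleted="$5"

  if [[ "${in_rc}" == 1 && "${ga_deleted}" == 1 ]]; then
    echo "delete"   # RC ∧ GAd: file deleted from GA
  elif [[ "${in_rc}" == 1 && "${in_ga}" == 0 ]]; then
    echo "ignore"   # RC ∧ !GA: new file for RC
  elif [[ "${in_rc}" == 0 && "${in_ga}" == 1 ]]; then
    echo "update"   # !RC ∧ GA: new file for GA
  elif [[ "${rc_changed}" == 1 && "${ga_changed}" == 1 ]]; then
    echo "review"   # RCΔ ∧ GAΔ: possible conflict
  elif [[ "${rc_changed}" == 0 && "${ga_changed}" == 1 ]]; then
    echo "update"   # RCu ∧ GAΔ: updated in GA only
  else
    echo "ignore"   # RCu ∧ GAu, or RCΔ ∧ GAu
  fi
}

classify_file 1 1 0 1 0   # prints "update"
```

The rubric does not say what happens when a file changed in RC and was deleted from GA. The sketch treats that case as a delete, which is a guess rather than documented behavior.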

If `-update` is `true`, the script creates a working branch, switches to that
branch, overwrites every RC file deemed "safe" with its GA counterpart, deletes
RC files that the git log shows as deleted in GA, prints a note to review the
information in the conflict file, and updates the last run date.

If `-update` is `false`, the script generates log files and exits.

@@ -171,6 +183,7 @@ but it also creates the following artifacts:
File set | Output file
------------------- | --------------------
GAΔ | output/ga-delta.txt
GAd | output/delete-list.txt
RCΔ | output/rc-delta.txt
GA-only | output/ga-only.txt
updated files | output/safe-list.txt
27 changes: 26 additions & 1 deletion scripts/sync-ga-to-rc/bash-helpers/definitions.sh
@@ -20,6 +20,7 @@
myDir=$(pwd)
repoName="web-unified-docs"
localReposDir=${myDir%"/${repoName}"*}
outputDir="${myDir}/output"

repoRoot="${localReposDir}/${repoName}" # Local root directory of the repo
docRoot="${repoRoot}/content/<PRODUCT>" # Root directory of product docs
@@ -32,6 +33,30 @@ rcDocs="" # Set in helper from command line arguments; for example, "${docRoot}/v1.21.x"
gaDocs="" # Set in helper from command line arguments; for example, "${docRoot}/v1.20.x"

jsonTemplate='{"file": "<FILENAME>", "shortname": "<SHORTNAME>", "commit": "<COMMIT>"}'
prBranch="bot/<PRODUCT>-ga-to-rc-sync-$(date +%Y%m%d)"
prBranch="bot/<PRODUCT>-ga-to-rc-sync-$(date +%Y%m%d-%H%M%S)"
prTitle="<PRODUCT> GA to RC auto-sync"
prBody="Draft PR created by \`sync-ga-to-rc.mjs\` to push recent GA updates to the RC release branch for <PRODUCT>"


# Helper function to convert an ISO time string to UTC
#
function getUTCDate {

  local dateString="${1}"
  local myShell="${SHELL}"
  local zBash="/bin/zsh"
  local uBash="/bin/bash"
  local unixTime

  # Bail if the date string was omitted
  if [[ -z "${dateString}" ]] ; then return; fi

  # macOS (where zsh is the default shell) ships BSD date, which takes
  # different flags than GNU date, so we convert differently based on the shell
  if [[ "${myShell}" == "${zBash}" ]] ; then
    unixTime=$(date -j -f '%Y-%m-%d %H:%M:%S %z' "${dateString}" +'%s')
    echo $(date -j -u -r ${unixTime} +'%Y-%m-%d %H:%M:%S')
  else
    echo $(date -u +'%Y-%m-%d %H:%M:%S' -d "${dateString}")
  fi
}
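
Because the difference lies in the `date` binary rather than the shell, an alternative is to probe `date` directly. The sketch below assumes that `date --version` succeeds only for GNU `date`. The function name `to_utc` is hypothetical and is not part of this change.

```shell
# Hypothetical alternative: pick the conversion based on the date binary.
# Input is an ISO-style string with a numeric offset,
# for example "2025-10-01 12:34:21 -0700".
to_utc() {
  local date_string="$1"
  local unix_time

  # Nothing to convert
  if [[ -z "${date_string}" ]]; then return 0; fi

  if date --version >/dev/null 2>&1; then
    # GNU date: parse the string directly and print it in UTC
    date -u -d "${date_string}" +'%Y-%m-%d %H:%M:%S'
  else
    # BSD date: parse to epoch seconds first, then print the epoch in UTC
    unix_time=$(date -j -f '%Y-%m-%d %H:%M:%S %z' "${date_string}" +'%s')
    date -j -u -r "${unix_time}" +'%Y-%m-%d %H:%M:%S'
  fi
}

to_utc '2025-10-01 12:34:21 -0700'   # prints "2025-10-01 19:34:21"
```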
44 changes: 44 additions & 0 deletions scripts/sync-ga-to-rc/bash-helpers/delete-rc-docs.sh
@@ -0,0 +1,44 @@
#
# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: BUSL-1.1
#
# ------------------------------------------------------------------------------
# Delete RC docs
#
# For every relative path in the input list, delete the RC version if it exists
#
# Expected usage: delete-rc-docs.sh <productKey> <gaFolder> <rcFolder> <deleteListFile>
# Example: delete-rc-docs.sh vault '1.20.x' '1.21.x (rc)' 'delete-list.txt'

# Pull in the common variable definitions
currDir="$(dirname "$0")"
. "${currDir}/definitions.sh"

# Set variables from command line argument
productKey="${1}" # root folder for product docs (product key)
gaFolder="${2}" # GA doc folder name
rcFolder="${3}" # RC doc folder name
deleteList="${4}" # file of GA paths to delete from RC

# Bail if any of the command line parameters were omitted
if [[ -z "${productKey}" ]] ; then exit ; fi
if [[ -z "${gaFolder}" ]] ; then exit ; fi
if [[ -z "${rcFolder}" ]] ; then exit ; fi
if [[ -z "${deleteList}" ]] ; then exit ; fi

cd "${repoRoot}"

while read -r line; do

  # Grab the filename and generate the corresponding RC path
  gaPath=$(echo "${line}" | awk -F " " '{print $3}')
  rcPath=${gaPath/${gaFolder}/${rcFolder}}

  # Skip any file that may have ended up in the list from a different product
  if [[ "${gaPath}" != *"/content/${productKey}/"* ]]; then continue ; fi
  if [[ "${rcPath}" != *"/content/${productKey}/"* ]]; then continue ; fi

  # If the file exists in the RC folder, delete it
  if [[ -f "${rcPath}" ]] ; then rm -r "${rcPath}" ; fi

done < "${outputDir}/${deleteList}"
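
The GA-to-RC path mapping in the loop above relies on bash pattern substitution. The sketch below isolates that step so it can be tried on its own. The function name `ga_to_rc_path` and the sample paths are hypothetical.

```shell
# Hypothetical helper: print the RC path for a GA path, or nothing if the path
# falls outside the product's content folder.
ga_to_rc_path() {
  local product="$1" ga_folder="$2" rc_folder="$3" ga_path="$4"
  local rc_path

  # Guard against paths from a different product
  if [[ "${ga_path}" != *"/content/${product}/"* ]]; then return 0; fi

  # Replace the first occurrence of the GA folder name. Quoting the pattern
  # makes bash match it literally rather than as a glob.
  rc_path=${ga_path/"${ga_folder}"/${rc_folder}}
  echo "${rc_path}"
}

ga_to_rc_path vault 'v1.20.x' 'v1.21.x (rc)' '/repo/content/vault/v1.20.x/docs/index.mdx'
# prints: /repo/content/vault/v1.21.x (rc)/docs/index.mdx
```

Only the first match is replaced, so a path that repeats the GA folder name deeper in the tree keeps the later occurrence unchanged.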
72 changes: 72 additions & 0 deletions scripts/sync-ga-to-rc/bash-helpers/deleted-in-ga.sh
@@ -0,0 +1,72 @@
#
# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: BUSL-1.1
#
# ------------------------------------------------------------------------------
#
# Get deleted files in GA
#
# Look for files deleted from GA so we can sync the delete to the RC docs.
# Check if the delete date is after the cutoff to avoid re-deleting pages that
# were previously reverted. If so, echo the details so the script can add it to
# the result array
#
# Expected usage: deleted-in-ga.sh <product> <gaBranch> <gaFolder> <cutoff>
# Example: deleted-in-ga.sh vault main v1.20.x '2025-10-01 12:34:21'

# Pull in the common variable definitions
currDir="$(dirname "$0")"
. "${currDir}/definitions.sh"

# Set variables from command line argument
productKey="${1}" # product slug
gaBranch="${2}" # GA doc branch
gaFolder="${3}" # folder for GA docs
cutoff="${4}" # cutoff date

# Bail if any of the command line parameters were omitted
if [[ -z "${productKey}" ]] ; then exit ; fi
if [[ -z "${gaFolder}" ]] ; then exit ; fi
if [[ -z "${gaBranch}" ]] ; then exit ; fi
if [[ -z "${cutoff}" ]] ; then exit ; fi

# Set the key path strings
docFolder="${docRoot/'<PRODUCT>'/${productKey}}/${gaFolder}" # Full path to the GA folder
filePath="content/${productKey}/${gaFolder}" # Relative path to the GA folder
pathPrefix=${docFolder/"${filePath}"/""} # Full path to the repo

cd "${repoRoot}"

git fetch origin

# Loop through the list of deleted files in the git log
IFS=$'\n'
for file in $(
  git log \
    --diff-filter=D \
    --name-only \
    --summary ${gaBranch} | \
  grep "${filePath}"
); do

  # The git log provides the relative path as the "name" but we want to record
  # the full path
  fullFilePath="${pathPrefix}${file}"
  rawCommitDate=$(
    git log --all -1 --pretty=format:%ad --date=iso -- "${fullFilePath}"
  )

  lastCommit=$(getUTCDate "${rawCommitDate}")

  # If the last commit happened after the cutoff, add it to the results
  # We check the last commit time to avoid repeatedly deleting files that
  # the user may have reinstated since the last run

  if [[ "${cutoff}" < "${lastCommit}" ]]; then
    shortName=${fullFilePath/"${filePath}"/""}
    jsonString=${jsonTemplate/'<FILENAME>'/"${fullFilePath}"}
    jsonString=${jsonString/'<SHORTNAME>'/"${shortName}"}
    jsonString=${jsonString/'<COMMIT>'/"${lastCommit}"}
    echo ${jsonString}
  fi
done
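
The cutoff check in the loop above compares the two timestamps as strings. That works because both values use the same fixed-width `YYYY-MM-DD HH:MM:SS` layout in UTC, so lexical order matches chronological order. A minimal sketch of the comparison, using a hypothetical function name `is_after_cutoff`:

```shell
# Hypothetical helper: succeed when commit_date is strictly later than cutoff.
# Both arguments must be UTC strings in 'YYYY-MM-DD HH:MM:SS' format.
is_after_cutoff() {
  local cutoff="$1" commit_date="$2"
  [[ "${cutoff}" < "${commit_date}" ]]
}

if is_after_cutoff '2025-10-01 12:34:21' '2025-10-02 08:00:00'; then
  echo "sync the delete"
fi
# prints: sync the delete
```

The comparison would stop being reliable if one side kept a timezone offset or used a different field order, which is presumably why the helper normalizes dates through `getUTCDate` first.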