Add option to use hash in changelog filenames #996

Merged
ecraig12345 merged 4 commits into master from ecraig/changelog-hash on Nov 6, 2024

@ecraig12345 (Member) commented Oct 4, 2024

Add an option changelog.uniqueFilenames which adds an 8-character random suffix to the end of the changelog filename (before the extension): e.g. CHANGELOG-abcd1234.md.

EDIT: After discussion, the suffix is now the first 8 characters of the MD5 hash digest of the package name.

This will stay stable between commits, and any existing changelog files will be renamed when modified. Existing suffixed files should also be renamed if the package is renamed. (I'm guessing 8 characters is good enough, but we could also use 12 or more.)
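
As a rough illustration of the scheme described above, the suffix could be derived along these lines. This is a minimal sketch, not the PR's actual implementation: `getChangelogFilename` is a hypothetical helper invented for this example, and only the `crypto.createHash('md5')` expression mirrors what appears in the diff.

```typescript
import * as crypto from 'crypto';

/**
 * Hypothetical helper: builds a changelog filename whose suffix is the first
 * 8 hex characters of the MD5 digest of the package name.
 */
function getChangelogFilename(packageName: string, ext: 'md' | 'json'): string {
  const hash = crypto.createHash('md5').update(packageName).digest('hex').slice(0, 8);
  return `CHANGELOG-${hash}.${ext}`;
}

// The suffix depends only on the package name, so it is identical on every run.
const first = getChangelogFilename('@scope/my-package', 'md');
const second = getChangelogFilename('@scope/my-package', 'md');
console.log(first === second); // true
```

Because the input is the package name rather than a path or a random value, moving the package to another folder leaves the filename unchanged, while renaming the package produces a new suffix.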

This is one option for working around an issue with Git. The default name hash that Git uses when packing objects only considers the last 16 characters of a file's path, so many files with similar names can collide. Those collisions lead to inefficient packing, which in turn causes extreme size growth in larger repos with many colliding changelog files.

Bonus: while adding the hash, I learned we can use crypto.randomUUID() for change file IDs and remove the uuid dependency.
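
For reference, a minimal sketch of generating an ID with the built-in API instead of the `uuid` package. `makeChangeFileName` and its naming scheme are assumptions made for illustration, not the format the tool actually uses. `crypto.randomUUID()` requires Node 14.17 or later.

```typescript
import { randomUUID } from 'crypto';

// Hypothetical naming scheme for illustration; the real change file format may differ.
function makeChangeFileName(packageName: string): string {
  // Replace characters that are awkward in filenames (e.g. the "/" in scoped names).
  const safeName = packageName.replace(/[^a-zA-Z0-9@]/g, '-');
  // randomUUID() returns a random version 4 UUID string.
  return `${safeName}-${randomUUID()}.json`;
}

console.log(makeChangeFileName('@scope/my-package'));
```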

(I also updated a couple things with test configuration: disable type checking in ts-jest since it's redundant and slow, and set launch.json to use Node 14 like the rest of the repo.)

Related to #978

@ecraig12345 ecraig12345 force-pushed the ecraig/changelog-hash branch from 174f592 to daa348b Compare October 4, 2024 23:46

@derrickstolee (Collaborator) left a comment

Thanks for working on this. I'm not an expert in this codebase or JavaScript, so take all of my comments with a grain of salt. I tried to focus on the highest level of functionality, and I believe your implementation will suffice there. I prod in a couple of directions for your consideration, but do not consider them blockers.

Comment thread src/changelog/prepareChangelogPaths.ts Outdated
Comment thread src/changelog/prepareChangelogPaths.ts Outdated
@ecraig12345 ecraig12345 changed the title Add option to use unique suffix in changelog filenames Add option to use hash in changelog filenames Oct 8, 2024
if (existingSuffixedPaths[ext]) {
moveIfNeeded({ oldPath: existingSuffixedPaths[ext]!, newPath: defaultPath, hadSuffix: true });
// Generate a unique filename based on the package name hash.
const hash = crypto.createHash('md5').update(packageName).digest('hex').slice(0, 8);
Collaborator

Thank you for this change. This will avoid the possibility of some corner cases with the random method. There may even be some benefits if a package is moved within the repo (without renaming the package).

let newestDate = 0;
let newestFile: string | undefined;

for (const file of fs.readdirSync(cwd)) {

Best practice would be to pass the { withFileTypes: true } option and test that the entry is a file.
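
A sketch of what this suggestion might look like. The loop body is not shown in the excerpt above, so the modification-time comparison and the `findNewestFile` wrapper are assumptions; only `newestDate`, `newestFile`, and the `readdirSync` loop come from the diff.

```typescript
import * as fs from 'fs';
import * as path from 'path';

/** Hypothetical helper: returns the most recently modified regular file in `cwd`, if any. */
function findNewestFile(cwd: string): string | undefined {
  let newestDate = 0;
  let newestFile: string | undefined;

  // withFileTypes makes readdirSync return fs.Dirent objects that know their entry type.
  for (const entry of fs.readdirSync(cwd, { withFileTypes: true })) {
    if (!entry.isFile()) {
      continue; // skip directories, symlinks, and other non-file entries
    }
    const fullPath = path.join(cwd, entry.name);
    const modified = fs.statSync(fullPath).mtimeMs;
    if (modified > newestDate) {
      newestDate = modified;
      newestFile = fullPath;
    }
  }
  return newestFile;
}
```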

Comment on lines +100 to +101
if (fs.existsSync(oldPath)) {
fs.renameSync(oldPath, newPath);

Checking whether a file exists before operating on it is generally an anti-pattern that the authors of the file system APIs advise against. In theory the file's existence can change between the two calls, so you have to handle non-existence during the real call anyway. It is simpler to just handle the ENOENT or ENOTDIR error from the renameSync call.
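
One way this suggestion could be applied is sketched below. `renameIfExists` is a hypothetical helper, not code from the PR, and whether other error codes should also be tolerated is a judgment call left to the real implementation.

```typescript
import * as fs from 'fs';

/**
 * Hypothetical helper: attempts the rename directly and returns false, instead
 * of throwing, when it fails with ENOENT or ENOTDIR (typically a missing source).
 */
function renameIfExists(oldPath: string, newPath: string): boolean {
  try {
    fs.renameSync(oldPath, newPath);
    return true;
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code;
    if (code === 'ENOENT' || code === 'ENOTDIR') {
      return false; // nothing to rename
    }
    throw err; // permission problems and other failures are still real errors
  }
}
```

This removes the window between `existsSync` and `renameSync` in which another process could delete or move the file.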

@ecraig12345 ecraig12345 force-pushed the ecraig/changelog-hash branch from f484507 to 474c58c Compare November 6, 2024 22:21
@ecraig12345 ecraig12345 merged commit e349314 into master Nov 6, 2024
@ecraig12345 ecraig12345 deleted the ecraig/changelog-hash branch November 6, 2024 22:33