Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Add metrics collection for committing partial changes, undoing the last commit, and opening an expanded commit msg editor #1685

Merged
merged 13 commits into from
Sep 11, 2018
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 25 additions & 13 deletions lib/models/repository-states/present.js
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ import Remote from '../remote';
import RemoteSet from '../remote-set';
import Commit from '../commit';
import OperationStates from '../operation-states';
import {addEvent} from '../../reporter-proxy';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one thing I worry about is, we currently have some metrics tracking in git-shell-out-strategy. Which is basically as close to the "back end" as you can go. We also have tracking on the UI layer, which is useful for seeing how many users are clicking specific buttons and whatnot. If we are adding instrumentation to the middle layers of the application, how do we ensure we're not double counting events? Would it make more sense to move this instrumentation down a layer to keep it all in one place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question. @annthurium and I discussed it in person and I'll capture the conclusions here for posterity's sake --

While it's a great rule of thumb to keep instrumentation at the boundary layers of our application (UI or lowest layer), ultimately we want to insert tracking where the information we want to track lives. In this case, we want commit metadata including whether or not the commit is a partial commit, and for this we need to count the number of unstaged files. We have logic for determining the number of unstaged files in the Repository layer, and not in the GitShellOutStrategy (we try to keep GSOS methods very basic, simply fetching data from Git and making it available to the Repository layer).

So in the end, we have a bit of duplication in that we are collecting commit events in the Repository and incrementing a counter whenever there is a commit in the GitShellOutStrategy. But, we're collecting additional metadata in the Repository layer, and we're avoiding adding complexity to our GitShellOutStrategy with duplicated logic just for the sake of gathering metrics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, cool 😄 I was wondering about what we should do about this.

I'm 👍 on putting our addEvent() calls where we have the most information about the metric we're trying to collect, as opposed to doing contortions to bring the information to the same semantic layer of the codebase. We can always git grep addEvent to see what we're collecting and where as a sanity check, but I feel like it's be pretty messy to drill enough options around purely to be able to count the right things. 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing we could do to keep a handle on the usefulness and non-redundancy of the metrics we're collecting might be to assert about them in our "integration"-style tests. That'd help us see a view of activity we'd see from the specific, holistic user behaviors that we're asserting against there.


/**
* State used when the working directory contains a valid git repository and can be interacted with. Performs
Expand Down Expand Up @@ -233,17 +234,27 @@ export default class Present extends State {
Keys.branches,
],
// eslint-disable-next-line no-shadow
() => this.executePipelineAction('COMMIT', (message, options) => {
const opts = (!options || !options.coAuthors)
? options
: {
...options,
coAuthors: options.coAuthors.map(author => {
return {email: author.getEmail(), name: author.getFullName()};
}),
};
() => this.executePipelineAction('COMMIT', async (message, options) => {
// check with Ash that I retained the logic he was going for...
const coAuthors = options && options.coAuthors;
const opts = !coAuthors ? options : {
...options,
coAuthors: coAuthors.map(author => {
return {email: author.getEmail(), name: author.getFullName()};
}),
};
Copy link
Contributor Author

@kuychaco kuychaco Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smashwilson can you double check to see if I retained the logic you originally intended with this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, it looks good to me 👍

Default options to {}; if options contains coAuthors, translate it to an Array of email/name pairs.


return this.git().commit(message, opts);
await this.git().commit(message, opts);
const {unstagedFiles, mergeConflictFiles} = await this.getStatusesForChangedFiles();
const unstagedCount = Object.keys({...unstagedFiles, ...mergeConflictFiles}).length;
addEvent('commit', {
package: 'github',
partial: unstagedCount > 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there are several UI options for partial staging, it would be interesting to instrument those as well. Not something you have to implement here, just something to consider for future iterations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had that thought but then a couple of considerations caused me to pause. Maybe you have thoughts that would help with un-pausing :).

Do you have specific questions you'd like answered by this information or examples of conclusions we might be able to draw?

These were the considerations that caused me to pause initially:

  • as mentioned in the description, tracking all of the staging/unstaging events seems like it could be a lot of noise with little additional value. For example, keyboard users might stage files by hitting enter a bunch of times for each line/hunk/file they want to stage... so for a given partial commit, there could be a ton of staging events. Maybe this could provide valuable information, but nothing came to mind for me immediately
  • the distinction between partial staging as opposed to staging everything can't necessarily be inferred from instrumenting the UI controls. For example, clicking the "Stage selection" button could result in partial staging of a file, or it could result in staging all changes if the user has selected all of the changes before clicking the button.

💭 welcome if you have any!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about how users are staging files. Our file staging is extremely complex because we have many ways of performing the same action. For example, we have stage and unstage all both in the staged / unstaged changes pane and in the diff view for individual files. Ditto for hunk/line staging and unstaging.

If all of these are heavily used, great. If not, we should consider simplifying so we don't have so many different ways of performing the same operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, keyboard users might stage files by hitting enter a bunch of times for each line/hunk/file they want to stage... so for a given partial commit, there could be a ton of staging events.

We'd almost have to track a flag locally (somewhere in package state... ?) that recorded a bunch of aggregate information about a commit while it's being staged, then increment actual metrics when the commit is finalized here based on their final states.

For example, when a user clicks "Stage Selection" in the FilePatchItem, we know if the file still includes unstaged changes or not, so we can update a "commit includes partial file" flag to true or false; then, at commit time, we check the flag and send a partiallyStagedFile property on the final event.

I have no idea where those flags should live, though - we don't really have a model yet for "the commit under construction." Also the FilePatchItem is about to change drastically so... 🙃

If all of these are heavily used, great. If not, we should consider simplifying so we don't have so many different ways of performing the same operation.

Yeah, that's a great point. I'm not sure how we'd do that, though 🤔

amend: !!options.amend,
coAuthorCount: coAuthors ? coAuthors.length : 0,
});
// note: this is somewhat redundant to the `incrementCounter` call in `GitShellOutStrategy` on all git commands,
// here we get useful additional metadata
}, message, options),
);
}
Expand Down Expand Up @@ -344,6 +355,7 @@ export default class Present extends State {
async () => {
try {
await this.git().reset('soft', 'HEAD~');
addEvent('undoLastCommit', {package: 'github'});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for the counter names, we're using dash separation. Could you make this undo-last-commit instead?

} catch (e) {
if (/unknown revision/.test(e.stdErr)) {
// Initial commit
Expand Down Expand Up @@ -528,9 +540,9 @@ export default class Present extends State {
this.transitionTo('TooLarge');
return {
branch: {},
stagedFiles: [],
unstagedFiles: [],
mergeConflictFiles: [],
stagedFiles: {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for my own education: why did you change this function to return objects instead of arrays? (I'm sure there's a perfectly good reason, I just don't know what it is 😃 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, yeah good question. it's not immediately clear.

This was to fix an oversight. I checked the format of the results returned from this function and objects are expected rather than arrays. Probably doesn't make a difference for the error case here since they're all empty objects anyways, but consistency is good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Good catch 😄

unstagedFiles: {},
mergeConflictFiles: {},
};
} else {
throw err;
Expand Down
Loading