Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/git return type audit #6586

Closed
22 tasks
outofambit opened this issue Jan 10, 2019 · 14 comments
Closed
22 tasks

lib/git return type audit #6586

outofambit opened this issue Jan 10, 2019 · 14 comments
Labels
help wanted Issues marked as ideal for external contributors tech-debt Issues and pull requests related to addressing technical debt or improving the codebase

Comments

@outofambit
Copy link
Contributor

outofambit commented Jan 10, 2019

Many of our functions in lib/git have the return type Promise<void>. This makes it difficult for callers to know if the function succeeded or failed (unless it throws an error).

This is particularly important for cases like

const mergeSuccessful = await gitStore.merge(branch)
where the merge may fail (for expected reasons) be we need to have that information as the caller.

Goal

Determine the appropriate return type for each of these functions.

Akin to #5923, it may be appropriate for some functions to return a sha or some other information on success, but at many of these functions should at least return true or a generic "success" type to indicate to the caller that nothing went wrong. (we should have at the very least exit codes from git() for all of these functions to judge that by.)

Functions to Audit

(these are all functions with the return type Promise<void> in lib/git)

  • createMergeCommit
  • setRemoteURL
  • applyPatchToIndex
  • renameBranch
  • checkoutBranch
  • checkoutPaths
  • clone
  • setGlobalConfigValue
  • writeGitDescription
  • fetchRefspec
  • fetch
  • saveGitIgnore
  • initGitRepository
  • installGlobalLFSFilters
  • abortMerge
  • pull
  • push
  • revertCommit
  • unstageAllFiles
  • resetSubmodulePaths
  • stageFiles
  • updateRef

Notes

Any PRs that change these return types should also include tests that verify those types are returned in the right situations so we don't inadvertently cause regressions.

cc @shiftkey, as well as @say25 for additional thoughts since they've done prior work like this

@say25
Copy link
Member

say25 commented Jan 10, 2019

Seems reasonable to me. I wouldn't mind taking this on myself. There should be no downside to exposing more info as we wont be losing any context by no longer returning void.

@outofambit
Copy link
Contributor Author

@say25 awesome :) yeah if you want to take this on that sounds great. does one PR per file seem reasonable? or whatever chunking you think makes sense so long as it's not all one giant PR. 🏂

@shiftkey
Copy link
Member

@outofambit @say25 I have a few questions about this but am swamped with 1.6 stuff. Please hang on a bit before we dive into PRs.

@say25
Copy link
Member

say25 commented Jan 10, 2019

Sounds good. feel free to assign me in the meantime, dont have permissions to do so myself.

@shiftkey
Copy link
Member

Related to all this, a while ago we talked about the error handling for Git operations and making that API more user-friendly. I opened the experiment-isolate-error-handling with one potential solution and commented about it here but didn't hear back. If you think that's helpful for clarifying things, I can move it to a PR and we can talk further about it.

It's not the whole solution, but should help make more explicit what the application does with Git operations that might fail.

@shiftkey
Copy link
Member

shiftkey commented Jan 11, 2019

I'd like to unpack and distill the scope of the discussion before we proceed with PRs. I need to kind of jump around a bit to tease out some ideas, so apologies in advance.

(we should have at the very least exit codes from git() for all of these functions to judge that by.)

I'm still a huge fan of keeping these exit codes inside the functions and keeping the callers unaware of the implementation details of Git. Git exit codes are woeful to understand from a automation perspective like we do in Desktop, and I don't want these assumptions and duplicated error handling to be scattered around the codebase to introduce further confusion.

it may be appropriate for some functions to return a sha or some other information on success

I believe these changes should be driven by what the application requires from Git, and I worry about spending lots of time introducing changes to the API that aren't used and introduce new behaviour. We haven't talked about adding tests alongside these API changes to ensure they work as intended and aren't regressed in the future, which would increase the overall work required. So I'm nervous about broad sweeping changes without nailing down the problem we're trying to solve.

"What the application requires from Git" is an interesting question we haven't really gone into detail about, so I want to take a moment here to elaborate on what I think of when this comes up, as I think that's helpful here to share.

To use the const mergeSuccessful = await gitStore.merge(branch) code as an example: if I were writing this code and noticed this new scenario about detecting when a merge message should be displayed, I'd use a return value which adds context to the operation.

enum MergeResult {
  NewCommitCreated,
  AlreadyUpToDate
}

const mergeResult = await gitStore.merge(branch)
if (mergeResult.NewCommitCreated) {
  // do something
} else {
  // no need to display anything
}

The fact there's now two different success scenarios to consider means using a boolean might be confused with the case of an error occurring.

This feels a bit more verbose, but I think it has some big benefits:

  • the return value is more explicit than success/failure, and gives the reader (and probably the developer too) more context to express what the application should be aware of
  • we avoid showing the raw Git details here, and can adapt the API to what the application needs
  • we keep the Git-specific concerns within the function, which stays as the one place to maintain the handling of various scenarios and conversion to application-specific needs

many of these functions should at least return true or a generic "success" type to indicate to the caller that nothing went wrong

I feel like we're getting ahead of the problem here, and need to ask about what we're trying to address. Here are a couple of significant problems I've seen:

  • it's not clear whether a Git operation succeeds or errors because undefined is used to signal an error, but we also use null or undefined to signal the lack of a result
  • components calling Git operations should be able to tell when a problem occurred and can react accordingly

I think we can achieve these two things without needing to do broad changes to the Git APIs, while also keeping around the default error handling that catches unhandled exit codes and displays messages to the user for these situations.

There's a functional programming concept called a Result type which is a convenient way of unifying success or failure when calling a function. This has made it into many languages where union types are supported (and plenty of others can mimic this), and can be modelled in Typescript like this:

export type Result<T> = T | Error

This means we can change our error handling API which previously returned that cursed undefined:

  public async performFailableOperation<T>(
    fn: () => Promise<T>,
    errorMetadata?: IErrorMetadata
  ): Promise<T | undefined> {
    try {
      const result = await fn()
      return result
    } catch (e) {
      e = new ErrorWithMetadata(e, {
        repository: this.repository,
        ...errorMetadata,
      })

      this.emitError(e)
      return undefined
  }

To instead return a richer result:

  public async performFailableOperation<T>(
    fn: () => Promise<T>,
    errorMetadata?: IErrorMetadata
  ): Promise<Result<T>> {
    try {
      const result = await fn()
      return result
    } catch (e) {
      e = new ErrorWithMetadata(e, {
        repository: this.repository,
        ...errorMetadata,
      })

      this.emitError(e)
      return new Error(e)
  }

This is where it gets really interesting. I added this to the experiment-isolate-error-handling branch to play around with and it raised a lot of compiler errors where we previously baked null or undefined checks. This is a crutch I want to avoid us using with designing Git APIs, and now that we have a Result type I can add some type guards and check the result is a success or error without touching the underlying Git APIs themselves:

export function isError<T>(result: Result<T>): result is Error {
  return result instanceof Error
}

export function isSuccess<T>(result: Result<T>): result is T {
  return !isError(result)
}

And with these type guards, I can make the exceptional cases more explicit than they were before:

diff --git a/app/src/lib/stores/git-store.ts b/app/src/lib/stores/git-store.ts
index 7b4deaf99..36dfbbc82 100644
--- a/app/src/lib/stores/git-store.ts
+++ b/app/src/lib/stores/git-store.ts
@@ -77,6 +77,7 @@ import { GitAuthor } from '../../models/git-author'
 import { IGitAccount } from '../../models/git-account'
 import { BaseStore } from './base-store'
 import { createFailableOperationHandler } from './error-handling'
+import { Result, isError, isSuccess } from '../../models/result'
 
 /** The number of commits to load from history per batch. */
 const CommitBatchSize = 100
@@ -128,7 +129,7 @@ export class GitStore extends BaseStore {
   private readonly withErrorHandling: <T>(
     fn: () => Promise<T>,
     errorMetadata?: IErrorMetadata | undefined
-  ) => Promise<T | undefined>
+  ) => Promise<Result<T>>
 
   public constructor(repository: Repository, shell: IAppShell) {
     super()
@@ -173,7 +174,8 @@ export class GitStore extends BaseStore {
     const commits = await this.withErrorHandling(() =>
       getCommits(this.repository, range, CommitBatchSize)
     )
-    if (commits == null) {
+
+    if (isError(commits)) {
       return
     }
 
@@ -219,7 +221,8 @@ export class GitStore extends BaseStore {
     const commits = await this.withErrorHandling(() =>
       getCommits(this.repository, `${lastSHA}^`, CommitBatchSize)
     )
-    if (!commits) {
+
+    if (isError(commits)) {
       return
     }
 
@@ -247,7 +250,8 @@ export class GitStore extends BaseStore {
     )
 
     this.requestsInFight.delete(requestKey)
-    if (!commits) {
+
+    if (isError(commits)) {
       return null
     }
 
@@ -269,14 +273,16 @@ export class GitStore extends BaseStore {
       ),
     ])
 
-    if (!localAndRemoteBranches) {
+    if (isError(localAndRemoteBranches)) {
       return
     }
 
     this._allBranches = this.mergeRemoteAndLocalBranches(localAndRemoteBranches)
 
     this.refreshDefaultBranch()
-    this.refreshRecentBranches(recentBranchNames)
+    if (isSuccess(recentBranchNames)) {
+      this.refreshRecentBranches(recentBranchNames)
+    }
 
     const commits = this._allBranches.map(b => b.tip)
 
@@ -378,10 +384,8 @@ export class GitStore extends BaseStore {
     return 'master'
   }
 
-  private refreshRecentBranches(
-    recentBranchNames: ReadonlyArray<string> | undefined
-  ) {
-    if (!recentBranchNames || !recentBranchNames.length) {
+  private refreshRecentBranches(recentBranchNames: ReadonlyArray<string>) {
+    if (recentBranchNames.length === 0) {
       this._recentBranches = []
       return
     }
@@ -440,7 +444,7 @@ export class GitStore extends BaseStore {
       return
     }
 
-    let localCommits: ReadonlyArray<Commit> | undefined
+    let localCommits: Result<ReadonlyArray<Commit>>
     if (branch.upstream) {
       const range = revRange(branch.upstream, branch.name)
       localCommits = await this.withErrorHandling(() =>
@@ -455,7 +459,7 @@ export class GitStore extends BaseStore {
       )
     }
 
-    if (!localCommits) {
+    if (isError(localCommits)) {
       return
     }
 
@@ -500,10 +504,9 @@ export class GitStore extends BaseStore {
       getStatus(this.repository)
     )
 
-    if (status == null) {
-      throw new Error(
-        `Unable to undo commit because there are too many files in your repository's working directory.`
-      )
+    if (isError(status)) {
+      log.error('undoFirstCommit was not able to read git status', status)
+      return
     }
 
     const paths = status.workingDirectory.files
@@ -871,7 +874,7 @@ export class GitStore extends BaseStore {
       getStatus(this.repository)
     )
 
-    if (!status) {
+    if (isError(status)) {
       return null
     }
 
@@ -920,7 +923,7 @@ export class GitStore extends BaseStore {
       getCommit(this.repository, sha)
     )
 
-    if (foundCommit != null) {
+    if (isSuccess(foundCommit)) {
       this.commitLookup.set(sha, foundCommit)
       return foundCommit
     }
@@ -1095,7 +1098,7 @@ export class GitStore extends BaseStore {
   }
 
   /** Merge the named branch into the current branch. */
-  public merge(branch: string): Promise<boolean | undefined> {
+  public merge(branch: string) {
     return this.withErrorHandling(() => merge(this.repository, branch), {
       gitContext: {
         kind: 'merge',

Making this change also highlighted some Git functions which returned T | null or T | undefined, and it's those APIs I'd love to fix sooner rather than later (by making them throw an error explicitly for error cases or to define the return type better to truly indicate no result was found, and then making application code more aware of these results).

To wrap up this long comment, these are the things I'd love to have front-and-center when we're working with our Git APIs:

  • functions should return values application-specific shapes where needed, and avoid exposing Git internals
  • it should be easy for callers to check success and failure no matter what the API returned
  • error handling should be easy to use when invoking Git functions

Let me know if you have any questions or things to add about the things I raised and this alternate direction.

@outofambit
Copy link
Contributor Author

@shiftkey

(we should have at the very least exit codes from git() for all of these functions to judge that by.)

I'm still a huge fan of keeping these exit codes inside the functions and keeping the callers unaware of the implementation details of Git.

I am not suggesting we expose error codes to callers of lib/git functions. I agree these should be abstracted. I’m indicating we have success information inside every lib/git function that we are throwing away

There's a functional programming concept called a Result type which is a convenient way of unifying success or failure when calling a function.

I am familiar with result types. It is one potential solution to the problem stated in this issue. (And I'm generally in favor of it.)

Let me know if you have any questions or things to add about the things I raised and this alternate direction.

This alternate direction doesn't solve the core problem that we are throwing away information at the lib/git level. (It solves a different, but related problem in the gitStore API.) In fact, I think they are complements and not mutually exclusive. (@shiftkey would you open a separate issue to discuss the gitStore API concerns?)

For example, in

export type Result<T> = T | Error

T would be void for all of the functions listed in this issue, which is not an informative result, even if you can trust that any unexpected errors would have been returned. Again the core problem is that we are throwing away information unnecessarily before we even get to the gitStore level.

Given all that, I see no reason to not proceed with this audit. At the very least, there's no harm in seeing if there's information we can return from each of these. At the most, this will prevent future churn and friction in feature development that touches lib/git. In fact, this issue would make the suggested work in gitStore more valuable.

cc @desktop/engineering and @desktop/comrades in case others want to weigh in

@shiftkey
Copy link
Member

I'm still nervous about this bit:

We haven't talked about adding tests alongside these API changes to ensure they work as intended and aren't regressed in the future, which would increase the overall work required.

Because of things like #6187 which was a result of not adding tests to #6119. What should we submitting as part of making an change inside a Git function @outofambit?

@shiftkey shiftkey added the tech-debt Issues and pull requests related to addressing technical debt or improving the codebase label Jan 11, 2019
@outofambit
Copy link
Contributor Author

That's a great shout @shiftkey. Let's agree that any of these functions we change return types for should be tested in the same PR. I'll update the issue description to communicate that.

@outofambit
Copy link
Contributor Author

from @iAmWillShepherd #6606 (comment)

That makes sense, but could we create a type for that so we get back Success or Failure where Failure is an object that has some other info we may need to handle the failure?

+1

@shiftkey shiftkey added ready-for-review Pull Requests that are ready to be reviewed by the maintainers help wanted Issues marked as ideal for external contributors and removed ready-for-review Pull Requests that are ready to be reviewed by the maintainers labels Jan 16, 2019
@outofambit
Copy link
Contributor Author

@say25 what are your thoughts on using a success/failure result type? is that something you'd be interested in proposing a solution for?

@say25
Copy link
Member

say25 commented Jan 25, 2019

It makes sense to me. Maybe if you or a core team member knocked out the first one I could do the rest?

@shiftkey
Copy link
Member

shiftkey commented Feb 3, 2019

I stumbled upon #2241 just now which feels related to this issue

@outofambit
Copy link
Contributor Author

this might be worth revisiting (post 1.7.0) since we experienced some pain (and lost time debugging) over in #7474, due to returning Promise<void> into gitStore.performFailableGitOperation

i made a quick fix by returning true at the end of checkout which might be a decent first iteration here just to alleviate this potential gotcha.

@say25 say25 removed their assignment Sep 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Issues marked as ideal for external contributors tech-debt Issues and pull requests related to addressing technical debt or improving the codebase
Projects
None yet
Development

No branches or pull requests

3 participants