New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Commit data #4641
Commit data #4641
Conversation
I was wondering if it makes sense to split the commit message from any notes. Currently they're both merged into the |
Codecov Report
@@ Coverage Diff @@
## master #4641 +/- ##
==========================================
+ Coverage 26.68% 26.69% +0.01%
==========================================
Files 501 501
Lines 40752 40713 -39
Branches 5961 5957 -4
==========================================
- Hits 10873 10869 -4
+ Misses 29379 29344 -35
Partials 500 500
Continue to review full report at Codecov.
|
/// <returns><c>true</c> if the command detailed an error, otherwise <c>false</c>.</returns> | ||
public static bool IsGitErrorMessage(string gitOutput) | ||
{ | ||
return Regex.IsMatch(gitOutput, @"^\s*(error:|fatal)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it's more efficient to have static readonly instances of Regex created with .Compiled option whenever possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's times like this I wish C# had static locals, like C++ does. Seems a shame to introduce a type-level field for something that's only used in one member. We now have nested functions. Why not nested static fields too?
I'd be interested in some benchmark results here, as well as info on runtime compilation times (consider it might cache DynamicMethod
internally or similar). I suspect the difference is negligible, especially for this regex pattern with little seeking about. And there's a trade-off: the compilation of such static Regex objects would happen when their containing type first loads, meaning more latency when GE starts up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for following up. Given we're talking nanos (not even micros) I don't think there's much to do here. Interesting to see the difference all the same. Allocation stats are interesting too. Will keep that in mind for future.
Are there any tight loops in GE? I can't think of any off the top of my head from what I've seen so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah.. I was glad and disappointed at the same time that execution took only nanos :D CPU perf wise this is absolutely not an issue, so only mem might be of a concern (not in this case though).
At some future time I'll do some benchmarking with proper tools to see where actual perf-hogs are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RussKie, sorry I missed your comment.
Dealing with any code base is like playing chess. There is always a move :) One must think both short and long term.
Now that I've spent a bit of time with GE I feel like my first proper contribution will be around consolidating git process execution, and the strategy for GitModule
refactoring (#4643). I see a series of moves that get us from where we are now to a point where the code is clearer, with better abstractions and is more testable. The design and coding is the easy part :) The effort on my part is to communicate effectively and ensure the solution meets the goals of the team, including the goals I don't yet see.
I have a working implementation of the git process execution code and would like to get it in front of you all. It's blocked by the few open PRs here (#4641, #4639 and especially #4662), so I want to do whatever I can to get these merged or closed and move forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to get to them as soon as can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The effort on my part is to communicate effectively and ensure the solution meets the goals of the team, including the goals I don't yet see.
👍
There are few things which are important to me in any project, applicable to both UI and API - consistency, usability, readability, testability and maintainability. These are interconnected and compliment one another.
SOLID principles seem to help a lot in achieving the above, hence I'm so obsessed with them :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dealing with any code base is like playing chess. There is always a move :) One must think both short and long term.
Yes, however OS has its caveats - if at (paid) work one has a greater degree of commitment and hence can afford greater degree of trust in promises of future delivery, in the OS world the degree of commitment is lower - family, paid work and other real life commitments have higher priorities (understandably so). People also lose interest...
In this case a promise of future refactor or rework may never be fulfilled, and the codebase would left with a rot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People also lose interest...
I've run open source projects for a long while and know how it goes. GitHub is filled with well intentioned half-promises to look into things and, well, life gets in the way or it turns out to be harder than expected, or whatever.
In my case I hope you can tell that I'm at least a bit invested in and committed to this project. How long that will last, I don't know. But for now I have energy and time.
But I don't think that matters. These PRs all improve the code along at least one of the dimensions you mentioned, leaving things in better shape than they were beforehand. I don't see any unfulfilled promise that would lead to rot, but maybe I suffer from a biased perspective.
GE isn't very testable right now. I'm happy to help kick things in that direction.
Oh and thanks for your kind words earlier. GE is also extremely lucky to have someone as committed as you taking the lead and driving things forward. I know how much time it takes.
public CommitData CreateFromRevision(GitRevision revision) | ||
{ | ||
if (revision == null) | ||
{ | ||
throw new ArgumentNullException(nameof(revision)); | ||
} | ||
|
||
CommitData data = new CommitData(revision.Guid, revision.TreeGuid, revision.ParentGuids.ToList().AsReadOnly(), | ||
return new CommitData(revision.Guid, revision.TreeGuid, revision.ParentGuids.ToList().AsReadOnly(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are optimizing.. string.Format could be replaced with string.Concat. string.Format has a rather high overhead of parsing a template. In this particular case strings can simply be concatenated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some BenchmarkDotNet measurements and the differences aren't as high as I thought. string.Concat is faster but by only ~20%. 254ms vs 308ms in my specific test case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Differences as percentages for this kind of thing seem more significant than they are in reality. A handful of milliseconds difference for a million operations isn't worth worrying about IMO. For that many operations, you're churning so much memory that you have other problems. Go for whatever is easiest to read. In this case, I would convert the lines just beneath your comment to string interpolation. There are many places in the current code that I'd use interpolation just for the clarity it brings, but that's another PR for another day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your position for high level code / code which is executed relatively rarely. But for code which is executed 1000s of times every single optimization counts because these small seemingly unimportant details add up.
In this specific case it seems I've made a mistake assuming this would be executed on all parsed history revisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go for whatever is easiest to read
Amen. I choose readability and maintainability over optimisation unless it is proven to be a bottle neck
try profiling the load of the settings dialog - VS shows half a dozen of GC
runs when the dialog is being opened...
…On 17 Mar 2018 9:14 am, "Donatas" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In GitCommands/Git/GitModule.cs
<#4641 (comment)>
:
> @@ -3859,5 +3858,15 @@ public bool StopTrackingFile(string filename)
{
return RunGitCmdResult("rm --cached " + filename).ExitedSuccessfully;
}
+
+ /// <summary>
+ /// Determines whether a git command's output indicates an error occurred.
+ /// </summary>
+ /// <param name="gitOutput">The output from the git command, to inspect.</param>
+ /// <returns><c>true</c> if the command detailed an error, otherwise <c>false</c>.</returns>
+ public static bool IsGitErrorMessage(string gitOutput)
+ {
+ return Regex.IsMatch(gitOutput, @"^\s*(error:|fatal)");
Yeah.. I was glad and disappointed at the same time that execution took
only nanos :D CPU perf wise this is absolutely not an issue, so only mem
might be of a concern (not in this case though).
At some future time I'll do some benchmarking with proper tools to see
where actual perf-hogs are.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#4641 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AEMyXv_oNPuFLkUPgyD0KSQtqcU1SCRLks5tfDkpgaJpZM4StE9s>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do partial squash(es) of benign/non functional/uninteresting changes (e.g. typos, decorations etc) to reduce the visual noise on the trunk?
@@ -3,7 +3,7 @@ | |||
|
|||
namespace GitCommands | |||
{ | |||
public class CommitData | |||
public sealed class CommitData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious as to why you chose to seal DTO/POCO?
I typically seal behavioural types since I don't like inheritance and can't guarantee that subclasses would not break the expected behaviours. However I leave POCOs open for extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to seal everything that doesn't have subclasses, even in library code. Once someone uses your API, you can unseal without breaking code, but not the other way around.
I rarely use inheritance when extending capabilities. Composition tends to be more flexible and powerful. When I do use inheritance the base class would have something virtual/abstract most of the time.
While there's no agreed upon definition of POCO, I generally think of a type that doesn't have any real behaviour, nor participate in any inheritance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
My experience is different. When you have a single version of software - it may be acceptable to rework the API. However it is a completely different story when you have multiple versions of API consumed by numerous projects...
GitCommands/CommitDataManager.cs
Outdated
@@ -15,17 +16,20 @@ public interface ICommitDataManager | |||
/// <param name="data">Data produced by a <c>git log</c> or <c>git show</c> command where <c>--format</c> | |||
/// was provided the string <see cref="CommitDataManager.LogFormat"/>.</param> | |||
/// <returns>CommitData object populated with parsed info from git string.</returns> | |||
CommitData CreateFromFormatedData(string data); | |||
[NotNull] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of expect to be laughed at for a noob question, but - what do we get with these attribute decorations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a minimum, these attributes serve as documentation of intent around whether values can or can not be null.
If you use R#, then they enable warnings about potential NullReferenceExceptions, (eg. passing null to a method that says it cannot be null, or using a nullable return value without checking first), and redundant code (eg. checking something for null that can never be null).
The act of annotating methods also makes you think a little more about a method's behaviour, or a field's invariants.
Hopefully in C# 8 support for nullable reference types will land, then the attributes can be converted to actual language constructs.
That is:
// before
[Nullable] private Bar _bar;
[NotNull] private Baz _baz;
[Nullable] public string Foo([NotNull] string s) { ... }
// after
private Bar? _bar;
private Baz _baz;
public string? Foo(string s) { ... }
Annotating now will help with this migration later.
The above covers [Nullable]
and [NotNull]
annotations, which are the most common. There are also:
[Pure]
for a method with no observable side effects (can be hoisted, and return value should not be discarded)[InstantHandle]
forIEnumerable<T>
and delegate parameters that are fully evaluated before the method returns[ItemNotNull]
and[ItemNullable]
for collection andTask<T>
types[ContractAnnotation]
that specifies how inputs and outputs relate to each other (eg. if the input isnull
the method will returnfalse
, allowing R# to flag something likeif (!FalseIfNull(notNullValue)) { /* unreachable code */ }
and so forth)
You can read more about them here.
All these attributes are defined conditionally meaning they do not end up in the compiled binary, so there's no overhead to using them.
GitCommands/CommitDataManager.cs
Outdated
@@ -206,6 +179,25 @@ public CommitData CreateFromRevision(GitRevision revision) | |||
revision.Body ?? revision.Subject); | |||
} | |||
|
|||
[ContractAnnotation("=>false,error:notnull,data:null")] | |||
[ContractAnnotation("=>true,error:null,data:notnull")] | |||
private bool TryGetCommitLog([NotNull] string commitId, [NotNull] string format, out string error, out string data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would move it after GetModule()
to help with A-Z order and over maintainability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll move it.
My rationale for that positioning: It's a private method only used by the two methods immediately before it. They're logically related to one another. GetModule
is in turn used by that, so introduced later still. When there are only a few members to a type, I tend to order them top to bottom. I've seen other codebases (three.js comes to mind) where members are ordered bottom to top, so as you read down the class you see the definitions before they're used.
Linqpad ROCKS.
…On Sat, Mar 17, 2018, 10:00 AM Donatas ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In GitCommands/Git/GitModule.cs
<#4641 (comment)>
:
> @@ -3859,5 +3858,15 @@ public bool StopTrackingFile(string filename)
{
return RunGitCmdResult("rm --cached " + filename).ExitedSuccessfully;
}
+
+ /// <summary>
+ /// Determines whether a git command's output indicates an error occurred.
+ /// </summary>
+ /// <param name="gitOutput">The output from the git command, to inspect.</param>
+ /// <returns><c>true</c> if the command detailed an error, otherwise <c>false</c>.</returns>
+ public static bool IsGitErrorMessage(string gitOutput)
+ {
+ return Regex.IsMatch(gitOutput, @"^\s*(error:|fatal)");
@RussKie <https://github.com/russkie> that's LINQPad. I use it often for
any kind of quick test or where I would normally create a console app for
something small. Free version is severely limited compared to Premium but I
don't regret a tiny bit buying it during some sale. Twitters are
unanimously positive too https://twitter.com/search?q=%23linqpad&src=typd
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#4641 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADdhsRVC56SCRYI7aZWLeYnarYBDwcuqks5tfRcVgaJpZM4StE9s>
.
|
I'll try profiling the settings panel in dotMemory and see what comes back. |
Sure, I'll squash some stuff. Would you be against just squashing the whole PR? |
This is a combination of 9 commits: - Use params overload - Preconditions first - Inline temporaries - Use IReadOnlyList<T> over ReadOnlyCollection<T> - Avoid temporary string and repeat lookup - Reorder properties - Seal DTO - Replace consts with named arguments - Reorder members to place consts before fields
Old test (IndexOf + 40) meant (index < 0) could never trigger. Also (IndexOf + 40 > Length) makes little sense. Just need to verify that output string contains the requested SHA-1.
The sha1 parameter must match CommitData.Guid. There's no point passing it. It only creates opportunity for bugs.
Note this shows how many callers ignore any error message.
Also rename UpdateCommitMessage to UpdateBody, as the actual property updated on CommitInfo is Body.
Changes done. |
The PR contains a number of changes - some functional, some aren't. Squashing them all into the same commit doesn't feel right. |
@RussKie I did actually squash quite a few of the commits here down as far as I could go without hitting conflicts big enough to discourage me from spending a lot of time on it. I get where you're coming from and will endeavour to make fewer commits in future. For my own projects I don't worry, and usually appreciate small commits when discovering bugs and bisecting. But if you're concerned about noise in the history, I'll go coarser.
I ran a memory profile while opening settings and the majority of the heap allocations were strings related to XML parsing. If there's anything else you need from this PR, please let me know. |
Review of
CommitData
andCommitDataManager
classes.Changes proposed in this pull request:
GitModule.IsGitErrorMessage
method for standard detection of error messages in output stringsWhat did I do to test the code and ensure quality:
Has been tested on: