[REEF-2022] Fixed AzureBlockBlobFileSystem.GetChildren api and unit tests #1464
[REEF-2022] Fixed AzureBlockBlobFileSystem.GetChildren api and unit tests #1464
Conversation
@@ -162,13 +162,27 @@ public void DeleteDirectory(Uri directoryUri) | |||
public IEnumerable<Uri> GetChildren(Uri directoryUri) | |||
{ | |||
BlobContinuationToken blobContinuationToken = null; | |||
var path = directoryUri.AbsolutePath.Trim('/'); | |||
string path = directoryUri.AbsolutePath.Trim('/'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This path work is unnecessary. Just use directoryUri.Segments #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the suggestion. #Resolved
/// </summary> | ||
BlobResultSegment ListBlobsSegmented(string prefix, bool useFlatListing, BlobListingDetails blobListingDetails, int? maxResults, | ||
BlobResultSegment ListBlobsSegmented(string containerName, string directoryName, bool useFlatListing, BlobListingDetails blobListingDetails, int? maxResults, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"directoryName" is confusing since this could actually be a path. I suggest using "relativeAddress" to match the blob API. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. #Resolved
|
||
Assert.Equal(allNames.Count(), blobs.Count()); | ||
|
||
// List files only in the sub-folder in the container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second sections of this test is identical to the first section of this test. Please consolidate to a helper method. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consolidated code into a common method. #Resolved
@@ -138,6 +139,43 @@ public void TestDeleteE2E() | |||
Assert.False(CheckBlobExists(blob)); | |||
} | |||
|
|||
[Fact(Skip = SkipMessage)] | |||
public void TestGetChildrenE2E() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also test listing containers. After all the IFileSystem should work like the local file system. Containers are just root directories from an outside observer. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I added code to make it work for both containers and blobs. #Closed
Assert.Equal(expectedChildBlobNames.Count(), blobNames.Count()); | ||
} | ||
|
||
public void ValidateChildrenWithContainers(Uri rootUri, IEnumerable<string> expectedContainerNames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason for another method. This method is basically identical to ValidateChildenWithBlobs. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm .. they are different .. in how we extract out the blob names or container names as the case may be from the output uri. I could try to consolidate it even more but I am afraid that we lose readability (containers vs blobs). Lets chat offline if you are not convinced with the reasoning. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed to remove the duplicate method. #Resolved
|
||
public void ValidateChildenWithBlobs(Uri storageBlobUri, IEnumerable<string> expectedChildBlobNames) | ||
{ | ||
IEnumerable<Uri> blobs = _fileSystem.GetChildren(storageBlobUri).ToList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToList() is unnecessary. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Got rid of it. #Resolved
IEnumerable<Uri> blobs = _fileSystem.GetChildren(storageBlobUri).ToList(); | ||
IEnumerable<string> blobNames = blobs.Select(b => b.LocalPath.Replace("/" + _container.Name + "/", string.Empty)); | ||
|
||
Assert.True(expectedChildBlobNames.All(blobName => blobNames.Contains(blobName)), "blobNames has elements that are not in expectedChildBlobNames"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CollectionAssert.AreEquivalent should be used instead. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont see AreEquivalent. I do see Assert.Equal() here but that requires that the two lists be in the same order since it uses CheckIfEnumerablesAreEqual .. which is not guaranteed in the lists that I have. I could go ahead and sort and then do this, but I felt that what I have is simple enough to follow. #Resolved
IEnumerable<string> blobNames = blobs.Select(b => b.LocalPath.Replace("/" + _container.Name + "/", string.Empty)); | ||
|
||
Assert.True(expectedChildBlobNames.All(blobName => blobNames.Contains(blobName)), "blobNames has elements that are not in expectedChildBlobNames"); | ||
Assert.Equal(expectedChildBlobNames.Count(), blobNames.Count()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary if CollectionAssert is used. #Closed
} | ||
|
||
// If not at the root folder, return all blobs within the container | ||
string containerName = segments[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible out of bounds exception. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by changing the condition earlier by ensuring this line is not executed if segments.Count() <= 1. #Resolved
foreach (string uploadedBlobName in parentFileNames.Concat(folderfileNames)) | ||
{ | ||
CloudBlockBlob blob = _container.GetBlockBlobReference(uploadedBlobName); | ||
UploadFromString(blob, "hello"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these files cleaned up anywhere if the test fails and you want to rerun? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the Dispose() method cleans up the container and everything in it. #Resolved
// setup | ||
string[] parentFileNames = new string[] { "sample1", "sample2", "sample3" }; | ||
string[] folderfileNames = new string[] { "folder1/sample4", "folder1/sample5" }; | ||
string[] parentListNames = new string[] { "sample1", "sample2", "sample3", "folder1/" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three arrays and their naming conventions make if very difficult to follow your logic. Can this be simplified with one list and the use of select statements? Something like the following. At the very least change parentFileNames to a concrete list so the concat at 150 isn't necessary.
string[] fileNames = new string[] { "sample1", "sample2", "sample3", "folder1/sample4", "folder1/sample5" };
var rootFilenames = fileNames.GroupBy(f => f.Split('/')[0]).Select(g => g.Key);
ValidateChildenWithBlobs(_container.Uri, rootFilenames);
var folderBlobs = fileNames.Where(f => f.StartsWith("folder1");
ValidateChildenWithBlobs(folderUri, folderBlobs);
OR
string[] fileNames = new string[] { "sample1", "sample2", "sample3","folder1/sample4", "folder1/sample5" };
string[] expectedFolderChildren = new string[] { "folder1/sample4", "folder1/sample5" };
string[] expectedRootChildren = new string[] { "sample1", "sample2", "sample3", "folder1/" };
``` #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Took the second option that you proposed. #Resolved
// If not at the root folder, return all blobs within the container | ||
string containerName = segments[1]; | ||
string relativeAddress = string.Empty; | ||
if (segments.Count() > 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unnecessary check since we've already guaranteed length >= 1. #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed check. #Resolved
{ | ||
IEnumerable<Uri> blobs = _fileSystem.GetChildren(storageBlobUri); | ||
Assert.Equal( | ||
expectedChildBlobs.Select(uri => uri.AbsoluteUri).OrderBy(uri => uri), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Select shouldn't necessasry #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System.Uri does not implement IComparer. So I had to get the AbsoluteUri (string) before doing the OrderBy. #Resolved
@markusweimer and @motus : This pull request has been approved. Please review as needed and merge to master. Thanks! |
9d4cb16
to
d8d82da
Compare
Pinging again for @markusweimer or @motus to please merge this pull request. The test failures that I see with the appveyor failures seem unrelated to this pull request. |
@sharathmalladi please give me a day or two. I really want to review and merge it, but I am in the middle of the huge #1466 now |
Pinging again for @markusweimer or @motus to please merge this pull request. There are other changes that are piling up that depend on this one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good! I am done with my round of nit picks 😄
P.S. a general comment: try to keep lines shorter than 120 characters (preferably below 100); trim trailing whitespaces; use CodeMaid or similar tool to fix these and other formatting issues for you
@@ -29,7 +29,7 @@ namespace Org.Apache.REEF.Driver.Evaluator | |||
internal class EvaluatorRequest : IEvaluatorRequest | |||
{ | |||
internal EvaluatorRequest() | |||
: this(0, 0, 1, string.Empty, Guid.NewGuid().ToString("N"), string.Empty, Enumerable.Empty<string>().ToList(), true, String.Empty) | |||
: this(0, 0, 1, string.Empty, Guid.NewGuid().ToString("N"), string.Empty, Enumerable.Empty<string>().ToList(), true, string.Empty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, why is it so complicated, with Enumerable
and such? Can't we just write, say
private static readonly string NoBatchId = Guid.NewGuid().ToString("N");
private static readonly string[] NoNodeNames = { };
internal EvaluatorRequest() : this(0, 0, 1, "", NoBatchId, "", NoNodeNames, true, "")
or, if you want more readability, maybe
internal EvaluatorRequest() : this(
number: 0,
megaBytes: 0,
core: 1,
rack: "",
NoBatchId,
runtimeName: "",
NoNodeNames,
relaxLocality: true,
nodeLabelExpression: "")
etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refined constructors by calling the relevant overrides in each place.
In reply to: 196612445 [](ancestors = 196612445)
// setup | ||
string[] fileNames = new string[] { "sample1", "sample2", "sample3", "folder1/sample4", "folder1/sample5" }; | ||
string[] expectedFolderChildren = new string[] { "folder1/sample4", "folder1/sample5" }; | ||
string[] expectedRootChildren = new string[] { "sample1", "sample2", "sample3", "folder1/" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string arrays can be private readonly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C# local variables cannot be marked readonly. If the suggestion was to make these as fields of the TestAzureBlockBlobFileSystemE2E class instead: I feel like having to expand the scope just to make it readonly is not a great idea. Wont fix.
In reply to: 196644714 [](ancestors = 196644714)
ValidateChildren(rootUri, new List<Uri> { _container.Uri }); | ||
} | ||
|
||
public void ValidateChildren(Uri storageBlobUri, IEnumerable<Uri> expectedChildBlobs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be private
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -157,22 +157,58 @@ public void DeleteDirectory(Uri directoryUri) | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe
Task.WaitAll(directory
.ListBlobs(true)
.OfType<ICloudBlob>()
.Select(blob => blob.DeleteIfExistsAsync())
.ToArray());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var path = directoryUri.AbsolutePath.Trim('/'); | ||
|
||
// If at the root, return all containers | ||
if (segments.Count() <= 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better use segments.Length
instead of Linq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
} | ||
|
||
blobContinuationToken = containerListing.ContinuationToken; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will raise an NPE if containerListing
can ever be null
. did you mean ?.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ | ||
throw new ArgumentException("Call to ListBlobsSegmented returned no results. Uri is invalid or does not have children.", nameof(directoryUri)); | ||
} | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need for else
block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. All files that were touched in this pr now have less than 120 characters per line.
In reply to: 196653152 [](ancestors = 196653152)
if (listing.Results != null) | ||
if (listing == null || listing.Results.Count() == 0) | ||
{ | ||
throw new ArgumentException("Call to ListBlobsSegmented returned no results. Uri is invalid or does not have children.", nameof(directoryUri)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is quite long. our coding standard requires that lines should be no longer than 120 characters; we usually keep it below 100 for readability.
maxResults, | ||
continuationToken, | ||
blobRequestOptions, | ||
operationContext).GetAwaiter().GetResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.GetAwaiter().GetResult()
can be simply .Result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see some discussions on this topic and it appears the two have different responses if an exception is thrown: https://stackoverflow.com/questions/17284517/is-task-result-the-same-as-getawaiter-getresult
From that discussion, I wager that GetAwaiter().GetResult() is the better option to use. If you still disagree, we can take this offline and resolve this.
In reply to: 196655236 [](ancestors = 196655236)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point!
|
||
public ContainerResultSegment ListContainersSegmented(BlobContinuationToken continuationToken) | ||
{ | ||
return _client.ListContainersSegmentedAsync(continuationToken).GetAwaiter().GetResult(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied on the other issue and I will follow the same resolution here once that is addressed.
In reply to: 196655252 [](ancestors = 196655252)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat!! Will test and merge now
Fixed the GetChildren api and added unit test.
JIRA:
[REEF-2022] https://issues.apache.org/jira/browse/REEF-2022