Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions pinot-integration-tests/pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -279,5 +279,10 @@
<artifactId>pinot-batch-ingestion-spark-3</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.pinot</groupId>
<artifactId>pinot-adls</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</project>
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.pinot.integration.tests.filesystem;

import java.net.URI;
import java.net.URISyntaxException;
import org.apache.pinot.plugin.filesystem.ADLSGen2PinotFS;
import org.apache.pinot.spi.env.PinotConfiguration;
import org.apache.pinot.spi.filesystem.PinotFS;
import org.testng.annotations.Test;


public class ADLSPinotFSClientTest extends BasePinotFSTest {
private static final String ACCOUNT_NAME = "accountName";
private static final String ACCESS_KEY = "accessKey";
private static final String FILE_SYSTEM_NAME = "fileSystemName";
Comment on lines +30 to +32
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these already defined elsewhere?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing it up. These are private constants in the respective FS implementations, we can think of making them public instead of duplicating it.

public static final String ADLS_ACCOUNT_NAME = "ADLS_ACCOUNT_NAME";
public static final String ADLS_ACCESS_KEY = "ADLS_ACCESS_KEY";
public static final String ADLS_FILE_SYSTEM_NAME = "ADLS_FILE_SYSTEM_NAME";
public static final String ADLS_FS_URI = "ADLS_FS_URI";
public static final String ADLS_ENABLE_FS_TESTS = "ADLS_ENABLE_FS_TESTS";

@Override
protected PinotFS getPinotFS() {
return new ADLSGen2PinotFS();
}

@Override
protected PinotConfiguration getFsConfigs() {
PinotConfiguration configuration = super.getFsConfigs();
configuration.setProperty(ACCOUNT_NAME, getEnvVar(ADLS_ACCOUNT_NAME));
configuration.setProperty(ACCESS_KEY, getEnvVar(ADLS_ACCESS_KEY));
configuration.setProperty(FILE_SYSTEM_NAME, getEnvVar(ADLS_FILE_SYSTEM_NAME));
return configuration;
}

@Override
protected URI getBaseDirectoryUri()
throws URISyntaxException {
String adlsUri = getEnvVar(ADLS_FS_URI);
return new URI(adlsUri + (adlsUri.endsWith("/") ? "" : "/") + "fsTest/" + _uuid);
}

@Override
protected boolean disableTests() {
// only run the tests when ADLS_ENABLE_FS_TESTS is specifically set to true
return !Boolean.parseBoolean(getEnvVar(ADLS_ENABLE_FS_TESTS));
}

@Override
@Test(enabled = false)
public void testLength() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any plans to have some uniformity in the way this interface has been implemented ?
Do we aim to change the ADLS implementation to also throw an exception when we get 0 in case of directory ?

I recall making a change for fs.delete in case of GCS when the directory or file is not present.
This divergent behavior lead to issues in segment deletion manager.

I understand it might be beyond of the scope of this PR though.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this PR is mostly aimed to surface the inconsistencies in FS implementations. We can have follow-up PRs to address these inconsistencies as they would be more involved based on special handling done at places like the example you gave above.
If any fs is unable to satisfy the interface requirement efficiently, it should call it out.

// test fails as interface expects the FS client to throw exception when PinotFS.length() is called on a directory,
// while the ADLS client returns 0.
}

@Override
@Test(enabled = false)
public void testOpen() {
// test fails as interface expects the FS client to throw an IOException when
// PinotFS.open() is called on non existent file,
// while the ADLS client throws a BlobStorageException which is a RuntimeException.
Comment on lines +76 to +78
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a bug that needs fixing?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes these are the existing inconsistencies that we are surfacing/auditing through these tests. Fixes for these cases can follow after this.

}

@Override
@Test(enabled = false)
public void testTouch() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Why can't the implementation be tweaked a bit to check whether the path exists or not and if not, create an empty file.

I guess the interface follows the default s3 behavior.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained above, these tests are just auditing the existing gaps. Fixes can be in separate PRs with more involved discussions around the underlying cloud limitations and impact.

// test fails as interface expects the FS client to create an empty file when
// PinotFS.touch() is called on a non existent path, while the ADLS client throws IOException.
}
}
Loading
Loading