
[BUG] The specified path does not exist.", 404, GET #1277

Closed
0xdarkman opened this issue Jul 17, 2022 · 30 comments
Labels
question Questions on how to use Delta Lake

Comments

@0xdarkman

I stream with Spark (Scala) from Kafka >> process the stream with job 1 and write to Delta Table 1 >> process the stream with job 2 and write that stream to Delta Table 2.

Job 2 runs for a while, but then it fails with the error below: The specified path does not exist.", 404, GET

It appends to the Delta table, so I do not understand why it gets a 404.

By the way, using abfss ---> abfss does not help either.

scalaVersion := "2.12.12"
sparkVersion = "3.2.1"
hadoopVersion = "3.3.0"

"com.microsoft.azure" % "azure-storage" % "8.6.6",
"io.delta" %% "delta-core" % "1.2.1",

Destination: storage account Gen2.

Job 1: reads from Kafka and writes to Delta Table 1 using the wasbs scheme and the blob.core.windows.net endpoint.
Job 2: reads from Delta Table 1 using the wasbs scheme and the blob.core.windows.net endpoint, and writes to Delta Table 2 using the abfss scheme and the dfs.core.windows.net endpoint (a sketch of this wiring is shown below).
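
For concreteness, a minimal sketch of what job 2's read/write wiring might look like, assuming the standard Delta streaming source and sink API; paths, variable names, and the transformation are placeholders, not the actual job code:

// Hypothetical sketch of job 2 (placeholder paths and names).
val table1Path = "wasbs://<container>@<account>.blob.core.windows.net/<table1>"
val table2Path = "abfss://<container>@<account>.dfs.core.windows.net/<table2>"

val input = spark.readStream
  .format("delta")
  .load(table1Path)            // stream the rows appended to Delta Table 1

val transformed = input        // job-specific processing would go here

transformed.writeStream
  .outputMode("append")
  .format("delta")
  .option("checkpointLocation", checkpointLocation2)
  .start(table2Path)           // append to Delta Table 2 on the dfs endpoint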

When I write the stream I use "append" output mode and partitionBy with multiple columns:

df
  .writeStream
  .outputMode("append")                              // append-only stream
  .partitionBy(partitioningCols: _*)                 // partition output by multiple columns
  .format("delta")
  .option("mergeSchema", "true")                     // allow schema evolution on write
  .option("checkpointLocation", checkpointLocation)  // streaming checkpoint directory
  .start(tablePath)                                  // tablePath is the Delta table root

The intended behaviour would be for the job not to fail.

Spark uses this configuration for auth:
spark.conf.set(s"fs.azure.account.key.$accountName.blob.core.windows.net", accountKey)
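
For comparison, a hedged sketch of how account-key auth is typically configured for both endpoints when wasbs and abfss paths are mixed; the dfs-endpoint key below is an assumption about the setup, not a confirmed fix for this issue:

// Account key for the Blob endpoint, used by wasbs:// paths (as in the config above).
spark.conf.set(s"fs.azure.account.key.$accountName.blob.core.windows.net", accountKey)
// Account key for the ADLS Gen2 (DFS) endpoint, used by abfss:// paths (assumption: same key).
spark.conf.set(s"fs.azure.account.key.$accountName.dfs.core.windows.net", accountKey)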

What is the problem here, and how can it be fixed?

Caused by: Operation failed: "The specified path does not exist.", 404, GET, https://raddsstatsstorage.dfs.core.windows.net/stats-prod?upn=false&resource=filesystem&maxResults=500&directory=1h/per_customer_fqdn/_delta_log&timeout=90&recursive=false, PathNotFound, "The specified path does not exist. RequestId:6733a6ac-d01f-0074-18aa-99c477000000 Time:2022-07-17T06:54:17.9766705Z"
at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:146)
at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:225)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:704)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:666)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:360)
... 58 more
22/07/17 06:54:18 INFO ShutdownHookManager: Shutdown hook called
22/07/17 06:54:18 INFO ShutdownHookManager: Deleting directory /tmp/spark-b04ed4ee-f724-45e7-b724-0c902c6de8b1
22/07/17 06:54:18 INFO ShutdownHookManager: Deleting directory /var/data/spark-f5e6d40d-907a-46d3-87b9-33cb85d7eb32/spark-dc28d981-c1b3-4578-be20-ada1a71003fd
22/07/17 06:54:18 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
22/07/17 06:54:18 INFO MetricsSystemImpl: azure-file-system metrics system stopped.
22/07/17 06:54:18 INFO MetricsSystemImpl: azure-file-system metrics system shutdown complete.

22/07/17 06:54:17 INFO OptimisticTransaction: [tableId=bf7d1277,txnId=c48d284b] Committed delta #3080 to abfss://stats-prod@raddsstatsstorage.dfs.core.windows.net/1h/per_customer_fqdn/_delta_log
22/07/17 06:54:17 INFO DeltaLog: Try to find Delta last complete checkpoint before version 3080
22/07/17 06:54:17 ERROR MicroBatchExecution: Query [id = 0147f9bb-9b4a-4a28-b364-d4cf15a02efa, runId = 649875a1-4ffa-4cb1-9dd5-6c8f429ec907] terminated with error
java.io.FileNotFoundException: Operation failed: "The specified path does not exist.", 404, GET, https://raddsstatsstorage.dfs.core.windows.net/stats-prod?upn=false&resource=filesystem&maxResults=500&directory=1h/per_customer_fqdn/_delta_log&timeout=90&recursive=false, PathNotFound, "The specified path does not exist. RequestId:6733a6ac-d01f-0074-18aa-99c477000000 Time:2022-07-17T06:54:17.9766705Z"
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1074)
at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:363)
at io.delta.storage.HadoopFileSystemLogStore.listFrom(HadoopFileSystemLogStore.java:59)
at org.apache.spark.sql.delta.storage.LogStoreAdaptor.listFrom(LogStore.scala:362)
at org.apache.spark.sql.delta.storage.DelegatingLogStore.listFrom(DelegatingLogStore.scala:125)
at org.apache.spark.sql.delta.Checkpoints.findLastCompleteCheckpoint(Checkpoints.scala:233)
at org.apache.spark.sql.delta.Checkpoints.findLastCompleteCheckpoint$(Checkpoints.scala:224)
at org.apache.spark.sql.delta.DeltaLog.findLastCompleteCheckpoint(DeltaLog.scala:64)
at org.apache.spark.sql.delta.SnapshotManagement.$anonfun$getSnapshotAt$1(SnapshotManagement.scala:568)
at scala.Option.orElse(Option.scala:447)

@0xdarkman (Author)

Additionally, job 1 also sees "com.microsoft.azure.storage.StorageException: The specified blob does not exist."

org.apache.spark.SparkException: Task failed while writing rows.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:500)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:321)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.hadoop.fs.azure.AzureException: com.microsoft.azure.storage.StorageException: The specified blob does not exist.
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2849)
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2715)
at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.restoreKey(NativeAzureFileSystem.java:1203)
at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.close(NativeAzureFileSystem.java:1069)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at org.apache.parquet.hadoop.util.HadoopPositionOutputStream.close(HadoopPositionOutputStream.java:64)
at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1106)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:132)
at org.apache.parquet.hadoop.ParquetRecordWriter.close(ParquetRecordWriter.java:164)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.close(ParquetOutputWriter.scala:41)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.releaseCurrentWriter(FileFormatDataWriter.scala:64)
at org.apache.spark.sql.execution.datasources.BaseDynamicPartitionDataWriter.renewCurrentWriter(FileFormatDataWriter.scala:266)
at org.apache.spark.sql.execution.datasources.DynamicPartitionDataSingleWriter.write(FileFormatDataWriter.scala:357)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithMetrics(FileFormatDataWriter.scala:85)
at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:92)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:304)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1496)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:311)
... 9 more
Caused by: com.microsoft.azure.storage.StorageException: The specified blob does not exist.
at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:87)
at com.microsoft.azure.storage.core.StorageRequest.materializeException(StorageRequest.java:305)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:196)
at com.microsoft.azure.storage.blob.CloudBlob.delete(CloudBlob.java:1054)
at org.apache.hadoop.fs.azure.StorageInterfaceImpl$CloudBlobWrapperImpl.delete(StorageInterfaceImpl.java:314)
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.safeDelete(AzureNativeFileSystemStore.java:2623)
at org.apache.hadoop.fs.azure.AzureNativeFileSystemStore.rename(AzureNativeFileSystemStore.java:2819)
... 27 more

@0xdarkman (Author)

I even changed the scheme for reading from wasbs to abfss and assigned public permissions on the container, but I am still getting this:

"Caused by: Operation failed: "The specified path does not exist.", 404, GET, https://raddsstatsstorage.dfs.core.windows.net/stats-prod?upn=false&resource=filesystem&maxResults=500&directory=10m/per_ip/_delta_log&timeout=90&recursive=false, PathNotFound, "The specified path does not exist. RequestId:e0249403-c01f-0006-1eef-99b549000000 Time:2022-07-17T15:12:17.9805557Z""

@jeffco11

I'm seeing the same issue. Our scenario is:

  1. Job 1 streams data from Event Hub to a Delta table.
  2. Job 2 streams data from that Delta table to another Delta table.

Job 2 sees these errors at least once a day.

@scottsand-db scottsand-db self-assigned this Jul 26, 2022
@zsxwing (Member)

zsxwing commented Aug 1, 2022

@0xdarkman @jeffco11 It looks like you can see the request ID in the error. Would you be able to reach out to Azure to check these requests? This is unlikely to be a Delta issue, as the error complains that the _delta_log directory doesn't exist.

@0xdarkman (Author)

@jeffco11 when you write the stream using the abfss scheme, do you create the container yourself upfront?

@0xdarkman (Author)

@0xdarkman @jeffco11 It looks like you can see the request ID in the error. Would you be able to reach out to Azure to check these requests? This is unlikely to be a Delta issue, as the error complains that the _delta_log directory doesn't exist.

_delta_log should be there, shouldn't it?

@0xdarkman (Author)

0xdarkman commented Aug 1, 2022

@0xdarkman @jeffco11 It looks like you can see the request ID in the error. Would you be able to reach out to Azure to check these requests? This is unlikely to be a Delta issue, as the error complains that the _delta_log directory doesn't exist.

This is the response from Azure:

"I used a random time to check for similar failed requests and there were thousands of such failed requests below are sample URLs"

"Since the file does not exists this is an 'expected' client error, but without knowing the access pattern of this storage client application we can't say for sure if the client is just checking for the existence of the file or if the client application expected the file to exist before it called GetFileProperties. What’s known is that 4xx are client errors and does not indicate any issue with the service"

"You might want to look at your Spark job to see if this is expected or if you can tweak the job such that it doesn't have to generate these 404s on the GetFileSystem properties applications."

"Also, are the jobs successful? If they are, this is more indication that the GetFileProperties 404 errors might be expected."

I am certain I should not be managing the existence of _delta_log at all.
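
As a diagnostic, one can list the _delta_log directory directly with the Hadoop FileSystem API to see whether the 404 reproduces outside of Delta. A sketch, with the path taken from the error message above; adjust it for your table:

import org.apache.hadoop.fs.Path

val logPath = new Path("abfss://stats-prod@raddsstatsstorage.dfs.core.windows.net/1h/per_customer_fqdn/_delta_log")
val fs = logPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
try {
  // The same kind of listStatus call Delta's LogStore issues; an ABFS 404 surfaces as FileNotFoundException.
  fs.listStatus(logPath).foreach(status => println(status.getPath))
} catch {
  case e: java.io.FileNotFoundException => println(s"Listing failed: ${e.getMessage}")
}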

@zsxwing (Member)

zsxwing commented Aug 1, 2022

@0xdarkman I assume _delta_log does exist in your case, but Azure storage somehow returns a 404. Is my understanding of your issue correct?

@jeffco11

jeffco11 commented Aug 1, 2022

I got a similar response from the Azure team. If you look for a file that doesn't exist, you should expect a 404 error. We don't manage the _delta_log, so I'm not sure what's happening.

In our scenario the _delta_log always exists because we are streaming data to the table constantly, and we occasionally get the 404s when we try to read from that table.

We created the container upfront originally and have been writing to this table for well over a year.

@0xdarkman (Author)

@0xdarkman I assume _delta_log does exist in your case, but Azure storage somehow returns a 404. Is my understanding of your issue correct?

Correct, _delta_log exists.

I make sure to write some data first before I start reading from the path. Furthermore, _delta_log is created and managed by the Delta table.

#1277 (comment) summarizes the problem well.

As soon as I switch to Gen1 (change the storage account, abfss -> wasbs, dfs -> blob), everything works well.
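
For readers following along, the two path forms involved in that switch look roughly like this (container, account, and path are placeholders):

// ADLS Gen2 via the ABFS driver (DFS endpoint):
val abfssPath = "abfss://<container>@<account>.dfs.core.windows.net/<path>"
// Blob storage via the WASB driver (Blob endpoint):
val wasbsPath = "wasbs://<container>@<account>.blob.core.windows.net/<path>"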

@zsxwing (Member)

zsxwing commented Aug 2, 2022

Yep, this looks like either a hadoop-azure library issue or a server issue. It sounds more like a server issue because the request URL seems correct to me.

Delta just calls FileSystems provided by hadoop-azure. It's unlikely that the issue can be fixed inside Delta Lake.

@0xdarkman (Author)

Yep, this looks like either a hadoop-azure library issue or a server issue. It sounds more like a server issue because the request URL seems correct to me.

Delta just calls FileSystems provided by hadoop-azure. It's unlikely that the issue can be fixed inside Delta Lake.

What do you call "server" in this case?

The Spark application runs on Kubernetes in Azure and writes to an Azure Storage Gen2 Delta table.

@zsxwing (Member)

zsxwing commented Aug 2, 2022

What do you call "server" in this case?

The Azure Storage server. hadoop-azure basically just calls the Azure Storage client SDK to send HTTP requests to the Azure Storage server (I don't know what Azure calls it, though).

@zsxwing zsxwing added question Questions on how to use Delta Lake and removed bug Something isn't working labels Aug 2, 2022
@sebbegg

sebbegg commented Oct 19, 2022

FWIW: we are seeing the same here, with _delta_log replaced by _spark_metadata, when consuming data from a Spark streaming job that writes to "plain" Parquet.
(Also on Azure Data Lake Gen2.)
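
For context, a minimal sketch of that kind of job: a streaming write to plain Parquet, whose file sink maintains a _spark_metadata directory under the output path (df, checkpointLocation, and outputPath are placeholders):

// Hypothetical plain-Parquet streaming sink; the file sink tracks committed files in <outputPath>/_spark_metadata.
df.writeStream
  .format("parquet")
  .option("checkpointLocation", checkpointLocation)
  .start(outputPath)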

@0xdarkman (Author)

It is resolved. MS implemented a bug fix a few days ago.

@zsxwing (Member)

zsxwing commented Oct 19, 2022

Thanks for the update. I will close this issue.

@zsxwing zsxwing closed this as completed Oct 19, 2022
@rahul24

rahul24 commented Nov 5, 2022

I'm still facing this issue. It seems the fix has not been rolled out yet.

@0xdarkman - Did they fix the issue at the storage layer or the Spark layer?

@KavithaPonjagannath

It is resolved. MS implemented a bug fix a few days ago.

What is the resolution?

@nnanto

nnanto commented Nov 28, 2022

It is resolved. MS implemented a bug fix a few days ago.

@0xdarkman Could you let us know where the fix was made?

@li-plaintext

It is resolved. MS implemented a bug fix a few days ago.

Same here. Where could I find the fix? Many thanks.

@zsxwing (Member)

zsxwing commented Jan 4, 2023

@li-plaintext would you or anyone be able to contact MSFT support to look at this issue?

@0xdarkman (Author)

0xdarkman commented Jan 11, 2023

I confirm the issue is not resolved.
The problem reoccurred at my end after a couple of weeks of constant running.
Please reopen the bug.
I have also just reached out to Microsoft.

@blindbox2

blindbox2 commented Feb 17, 2023

I confirm the issue is not resolved. The problem reoccurred at my end after a couple of weeks of constant running. Please reopen the bug. I have also just reached out to Microsoft.

@0xdarkman did you manage to solve this with the help of Microsoft? We are currently facing the exact same issue.

@0xdarkman (Author)

0xdarkman commented Feb 17, 2023

They told me that they applied the fix on the storage account, but then after a couple of weeks they rolled the change back because of a tenant change. So, simply put, they did not fix it.

Now I am waiting for the fix to be rolled out to all my storage accounts, but they still have not fixed it and are not telling me what the exact issue is. So, in short, lots of nonsense with no good explanation.

@blindbox2

Thanks for the quick reply! That sucks though... but according to them it was something to be fixed at the storage account level, if I understand correctly, and not something that is fixable by ourselves. I guess there will be no other option than to open a ticket with them.
Did you find a workaround in the meantime, or did you just stop using merges for now?

@0xdarkman (Author)

No, I did not find a solution. I used Gen1 instead as intermediate storage between the two streams.

@minnieshi

minnieshi commented Apr 26, 2023

Hi 0xdarkman,
Could you please share the bug report that was raised to MS? I am having the same issues, though not using streaming.

We use DF.write.format("delta").save(dataPath)
And it hits:
ERROR AzureBlobFileSystemStore: Received exception while listing a directory. Operation failed: "The specified path does not exist.", 404, GET,

The stack trace shows it is triggered when save is called.
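
For reference, a minimal form of that batch write (df and dataPath are placeholders; per the report above, the listing error surfaces while this save runs):

// Hypothetical minimal batch write to a Delta table path.
df.write
  .format("delta")
  .save(dataPath)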

@blindbox2

Hi 0xdarkman, could you please share the bug report that was raised to MS? I am having the same issues, though not using streaming.

We use DF.write.format("delta").save(dataPath) and it hits ERROR AzureBlobFileSystemStore: Received exception while listing a directory. Operation failed: "The specified path does not exist.", 404, GET,

The stack trace shows it is triggered when save is called.

In our case, what happened was that the container that Synapse used in the data lake for its internal storage and compute operations had accidentally been deleted. By restoring this container, in our case synws, the issue was resolved. Maybe this might help you out as well.

@kpsingh05

Hi @0xdarkman, could you please share what the resolution was?
I am facing the same error; YARN logs show the error, but the job succeeds. However, when I verify the Delta lake on storage, the data is not present.

Caused by: java.io.FileNotFoundException: Operation failed: "The specified path does not exist.", 404, PUT, https://redactedt/conatinerName/_temporary/0/_temporary/attempt_20231205133137_0183_m_000190_28900/Partition%3D19/TimeBucket%3D2023-12-02%2000%253A00%253A00/JobId%3D95ae4077-c42f-4fa4-9196-84166f638ccd/redacted/redacted/part-00190-f17dd298-a5a1-495e-9d10-3ec56650d0e6.c000.snappy.parquet?action=flush&retainUncommittedData=false&position=18418&close=true&timeout=90, PathNotFound, "The specified path does not exist. RequestId:87cdb709-c01f-0006-137f-277125000000 Time:2023-12-05T13:33:07.5183221Z"

@0xdarkman (Author)

0xdarkman commented Dec 14, 2023 via email
