
Spark NLP Configuration's spark.jsl.settings.storage.cluster_tmp_dir: Databricks DBFS location does not work #14129

Closed
jiamaozheng opened this issue Jan 9, 2024 · 3 comments · Fixed by #14132

Comments


jiamaozheng commented Jan 9, 2024

Is there an existing issue for this?

  • I have searched the existing issues and did not find a match.

Who can help?

Please help

What are you working on?

Databricks 9.1 LTS ML (includes Apache Spark 3.1.2, Scala 2.12)
com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.2

Current Behavior

The Databricks DBFS location configured in Spark NLP's spark.jsl.settings.storage.cluster_tmp_dir is not recognized. Instead, temporary files are resolved to an incorrect location with the prefix nvirginia-prod/423079709230XXXX/, such as nvirginia-prod/423079709230XXXX/dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/.

Expected Behavior

Documentation -
The location to use on a cluster for temporary files, such as unpacking indexes for WordEmbeddings. By default, this location is hadoop.tmp.dir, set via the Hadoop configuration for Apache Spark. NOTE: S3 is not supported; it must be local, HDFS, or DBFS.

We expect temporary files to be written to the configured Databricks DBFS path (dbfs:/PATH_TO_STORAGE).
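
A minimal illustration of the observed vs. expected path handling (hypothetical Python sketch; the actual logic lives in Spark NLP's Scala StorageHelper):

# Hypothetical sketch of the observed behavior: the configured dbfs:/ URI
# appears to be joined onto the workspace's default filesystem root instead
# of being used as-is.
fs_root = "nvirginia-prod/423079709230XXXX/"   # prefix seen in the stack trace
tmp_dir = "dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/"

observed = fs_root + tmp_dir   # nvirginia-prod/.../dbfs:/mnt/... -> S3 AccessDenied
expected = tmp_dir             # the fully qualified DBFS path, unchanged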

Steps To Reproduce

from sparknlp.annotator import *
from pyspark.sql import SparkSession 

tmp_dir = 'dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/'

spark = SparkSession.builder \
    .appName("Spark NLP") \
    .master("local[*]") \
    .config("spark.driver.memory", "16G") \ 
    .config("spark.driver.maxResultSize", "0") \ 
    .config("spark.kryoserializer.buffer.max", "2000M") \
    .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.2") \
    .config("spark.jsl.settings.storage.cluster_tmp_dir", tmp_dir) \
    .getOrCreate()

model_path = "dbfs:/FileStore/nlp_pretrained_models/glove_100d"

glove = (
    WordEmbeddingsModel.load(model_path)
    .setInputCols(["document", "clean_normal"])
    .setOutputCol("embeddings")
)

Spark NLP version and Apache Spark

Databricks 9.1 LTS ML (includes Apache Spark 3.1.2, Scala 2.12)
com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.2
spark-nlp==5.2.2

Type of Spark Application

Python Application

Java Version

openjdk version "1.8.0_362"
OpenJDK Runtime Environment (Zulu 8.68.0.21-CA-linux64) (build 1.8.0_362-b09)
OpenJDK 64-Bit Server VM (Zulu 8.68.0.21-CA-linux64) (build 25.362-b09, mixed mode)

Java Home Directory

/usr/lib/jvm/zulu8-ca-amd64/jre/

Setup and installation

spark-nlp==5.2.2
com.johnsnowlabs.nlp:spark-nlp_2.12:5.2.2

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

Stack Trace -

java.nio.file.AccessDeniedException: nvirginia-prod/423079709230XXXX/dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/: PUT 0-byte object  on nvirginia-prod/423079709230XXXX/dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied; request: PUT https://audix-prod-root.s3-fips.us-east-1.amazonaws.com nvirginia-prod/423079709230XXXX/dbfs%3A/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/ {} Hadoop 2.7.4, aws-sdk-java/1.11.678 Linux/5.4.0-1116-aws-fips OpenJDK_64-Bit_Server_VM/25.362-b09 java/1.8.0_362 scala/2.12.10 vendor/Azul_Systems,_Inc. com.amazonaws.services.s3.model.PutObjectRequest; Request ID: 6H4JNNPF9CYXGZDC, Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=, Cloud Provider: AWS, Instance ID: i-00f9a7585d7c77bc9 (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 6H4JNNPF9CYXGZDC; S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=), S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=:AccessDenied
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<command-1042743005935531> in <module>
     29 
     30 glove = (
---> 31     WordEmbeddingsModel.load(model_path)
     32     .setInputCols(["document", "clean_normal"])
     33     .setOutputCol("embeddings")

/databricks/spark/python/pyspark/ml/util.py in load(cls, path)
    461     def load(cls, path):
    462         """Reads an ML instance from the input path, a shortcut of `read().load(path)`."""
--> 463         return cls.read().load(path)
    464 
    465 

/databricks/spark/python/pyspark/ml/util.py in load(self, path)
    411         if not isinstance(path, str):
    412             raise TypeError("path should be a string, got type %s" % type(path))
--> 413         java_obj = self._jread.load(path)
    414         if not hasattr(self._clazz, "_from_java"):
    415             raise NotImplementedError("This Java ML type cannot be loaded into Python currently: %r"

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    115     def deco(*a, **kw):
    116         try:
--> 117             return f(*a, **kw)
    118         except py4j.protocol.Py4JJavaError as e:
    119             converted = convert_exception(e.java_exception)

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling o417.load.
: java.nio.file.AccessDeniedException: nvirginia-prod/423079709230XXXX/dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/: PUT 0-byte object  on nvirginia-prod/423079709230XXXX/dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied; request: PUT https://audix-prod-root.s3-fips.us-east-1.amazonaws.com nvirginia-prod/423079709230XXXX/dbfs%3A/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/ {} Hadoop 2.7.4, aws-sdk-java/1.11.678 Linux/5.4.0-1116-aws-fips OpenJDK_64-Bit_Server_VM/25.362-b09 java/1.8.0_362 scala/2.12.10 vendor/Azul_Systems,_Inc. com.amazonaws.services.s3.model.PutObjectRequest; Request ID: 6H4JNNPF9CYXGZDC, Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=, Cloud Provider: AWS, Instance ID: i-00f9a7585d7c77bc9 (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 6H4JNNPF9CYXGZDC; S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=), S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=:AccessDenied
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:248)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:120)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:274)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:333)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:270)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:245)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:3881)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:3853)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:3155)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:3088)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$mkdirs$3(DatabricksFileSystemV2.scala:820)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
	at com.databricks.s3a.S3AExceptionUtils$.convertAWSExceptionToJavaIOException(DatabricksStreamUtils.scala:66)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$mkdirs$2(DatabricksFileSystemV2.scala:818)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$withUserContextRecorded$2(DatabricksFileSystemV2.scala:1013)
	at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
	at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:510)
	at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
	at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:510)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withUserContextRecorded(DatabricksFileSystemV2.scala:986)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.$anonfun$mkdirs$1(DatabricksFileSystemV2.scala:817)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
	at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:395)
	at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:484)
	at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:504)
	at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:266)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:261)
	at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:258)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionContext(DatabricksFileSystemV2.scala:510)
	at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:305)
	at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:297)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.withAttributionTags(DatabricksFileSystemV2.scala:510)
	at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:479)
	at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:404)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperationWithResultTags(DatabricksFileSystemV2.scala:510)
	at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:395)
	at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:367)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.recordOperation(DatabricksFileSystemV2.scala:510)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystemV2.mkdirs(DatabricksFileSystemV2.scala:817)
	at com.databricks.backend.daemon.data.client.DatabricksFileSystem.mkdirs(DatabricksFileSystem.scala:198)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1881)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:351)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
	at com.johnsnowlabs.storage.StorageHelper$.copyIndexToCluster(StorageHelper.scala:104)
	at com.johnsnowlabs.storage.StorageHelper$.sendToCluster(StorageHelper.scala:91)
	at com.johnsnowlabs.storage.StorageHelper$.load(StorageHelper.scala:50)
	at com.johnsnowlabs.storage.HasStorageModel.$anonfun$deserializeStorage$1(HasStorageModel.scala:43)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at com.johnsnowlabs.storage.HasStorageModel.deserializeStorage(HasStorageModel.scala:42)
	at com.johnsnowlabs.storage.HasStorageModel.deserializeStorage$(HasStorageModel.scala:40)
	at com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel.deserializeStorage(WordEmbeddingsModel.scala:146)
	at com.johnsnowlabs.storage.StorageReadable.readStorage(StorageReadable.scala:34)
	at com.johnsnowlabs.storage.StorageReadable.readStorage$(StorageReadable.scala:33)
	at com.johnsnowlabs.nlp.embeddings.WordEmbeddingsModel$.readStorage(WordEmbeddingsModel.scala:303)
	at com.johnsnowlabs.storage.StorageReadable.$anonfun$$init$$1(StorageReadable.scala:37)
	at com.johnsnowlabs.storage.StorageReadable.$anonfun$$init$$1$adapted(StorageReadable.scala:37)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1(ParamsAndFeaturesReadable.scala:50)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$onRead$1$adapted(ParamsAndFeaturesReadable.scala:49)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.onRead(ParamsAndFeaturesReadable.scala:49)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable.$anonfun$read$1$adapted(ParamsAndFeaturesReadable.scala:61)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:38)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Thread.java:750)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied; request: PUT https://audix-prod-root.s3-fips.us-east-1.amazonaws.com nvirginia-prod/423079709230XXXX/dbfs%3A/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/6ade498ff4bf_cdx/EMBEDDINGS_glove_100d/ {} Hadoop 2.7.4, aws-sdk-java/1.11.678 Linux/5.4.0-1116-aws-fips OpenJDK_64-Bit_Server_VM/25.362-b09 java/1.8.0_362 scala/2.12.10 vendor/Azul_Systems,_Inc. com.amazonaws.services.s3.model.PutObjectRequest; Request ID: 6H4JNNPF9CYXGZDC, Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=, Cloud Provider: AWS, Instance ID: i-00f9a7585d7c77bc9 (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 6H4JNNPF9CYXGZDC; S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=), S3 Extended Request ID: pRZVB79TYuCzJPwyxHZKcAzSWeK8KxDKyla8U1/0qUhDrXjHeVB1rzhuJqmyeqMWZyuxlUs5h14=
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4926)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4872)
	at com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:390)
	at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:5806)
	at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1794)
	at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1754)
	at shaded.databricks.org.apache.hadoop.fs.s3a.EnforcingDatabricksS3Client.putObject(EnforcingDatabricksS3Client.scala:69)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.putObjectDirect(S3AFileSystem.java:2064)
	at shaded.databricks.org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$createEmptyObject$15(S3AFileSystem.java:3883)
	at shaded.databricks.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:118)
	... 82 more
@jiamaozheng jiamaozheng changed the title Spark NLP Configuration's spark.jsl.settings.storage.cluster_tmp_dir: dbfs location such as "dbfs:/writeable_path_for_my_user/" does not work - Databricks Spark NLP Configuration's spark.jsl.settings.storage.cluster_tmp_dir: Databricks DBFS location does not work Jan 9, 2024

maziyarpanahi commented Jan 10, 2024

Spark NLP 3.4.4 is an extremely old release. Could you please use the 5.2.2 release? Please follow these steps to make sure you are on the latest version: https://github.com/JohnSnowLabs/spark-nlp#databricks-cluster

PS: you must have write permission on the tmp_dir path via Spark natively, or it will fail with a permission-denied error.
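
A quick way to rule out permissions (hedged sketch; dbutils is available in Databricks notebooks, and the probe file name is made up):

# Verify the configured tmp_dir is writable before loading the model.
tmp_dir = "dbfs:/mnt/audix-prod-1-ephemeral/tmp/personalization_ml/spark_nlp/"
dbutils.fs.mkdirs(tmp_dir)                                # fails fast if the mount is read-only
dbutils.fs.put(tmp_dir + "_write_check.txt", "ok", True)  # overwrite=True
dbutils.fs.rm(tmp_dir + "_write_check.txt")               # clean up the probe file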


jiamaozheng commented Jan 11, 2024

@maziyarpanahi, thanks for your prompt response; I have updated to the latest release. The root cause of the issue appears to be an extra file-system URL prefix, which I have tested and verified via a Databricks notebook. I will submit a PR with the fix. Please let me know if you have any thoughts. Thanks,
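
A hedged sketch of the kind of fix being described (illustrative Python; the real change is in Spark NLP's Scala code, and resolve_cluster_tmp_dir is a hypothetical name): a path that already carries a filesystem scheme such as dbfs: should be used as-is, and only scheme-less paths resolved against the default filesystem.

from urllib.parse import urlparse

def resolve_cluster_tmp_dir(configured: str, default_fs_root: str) -> str:
    # Fully qualified URIs ("dbfs", "hdfs", "file", ...) keep their scheme
    # and are not prepended with the default filesystem root.
    if urlparse(configured).scheme:
        return configured
    # Scheme-less paths fall back to the default filesystem (hadoop.tmp.dir).
    return default_fs_root.rstrip("/") + "/" + configured.lstrip("/")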

maziyarpanahi commented

This should be resolved in the 5.3.0 release. Thank you @jiamaozheng for your valuable contribution 🚀
