spiceai · y-f-u · May 20, 2024 · May 17, 2024 · May 19, 2024 · May 17, 2024
diff --git a/spiceaidocs/docs/data-connectors/ftp.md b/spiceaidocs/docs/data-connectors/ftp.md
@@ -7,72 +7,61 @@ description: 'FTP/SFTP Data Connector Documentation'
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-The FTP/SFTP Data Connector enables federated SQL query across files stored in FTP/SFTP servers.
+The FTP/SFTP Data Connector enables federated SQL query across Parquet/CSV files stored on FTP/SFTP servers.
 
-Supports Parquet and CSV file formats.
+If a folder is provided, all child Parquet/CSV files will be loaded.
 
-If a folder is proivided, all child files will be loaded.
-
-To connect to any FTP/SFTP server, specify `ftp` or `sftp` as a selector in the `from` value for the dataset.
+## Configuration
 
 <Tabs>
   <TabItem value="ftp" label="FTP" default>
-  ```yaml
-  datasets:
-    - from: ftp://<host>/path/to/folder/
-      name: my_dataset
-  ```
-  </TabItem>
-  <TabItem value="sftp" label="SFTP">
-  ```yaml
-  datasets:
-    - from: sftp://<host>/path/to/folder/
-      name: my_dataset
-  ```
-  </TabItem>
-</Tabs>
+    ### Parameters
 
-## Configuration
+    The connection to FTP can be configured by providing the following params:
 
-<Tabs>
-  <TabItem value="ftp" label="FTP" default>
-    - `file_format`: Optional parameter, specifies the requested file format.
+    - `file_format`: Optional, specifies the requested file format.
       - `parquet`: (default) Parquet file format.
       - `csv`: CSV file format.
-    - `ftp_port`: Optional parameter, specifies the port of the FTP server. Default is 21. E.g. `ftp_port: 21`
+    - `ftp_port`: Optional, specifies the port of the FTP server. Default is 21. E.g. `ftp_port: 21`
     - `ftp_user`: The username for the FTP server. E.g. `ftp_user: my-ftp-user`
     - `ftp_pass`: The password for the FTP server. E.g. `ftp_pass: my-ftp-password`
     - `ftp_pass_key`: The secret key container the password to connect with. E.g. `ftp_pass_key: my-ftp-password-key`
+
+    Additional CSV-related parameters can be configured; see [CSV Parameters](../reference/file_format.md#csv).
+
+    ### Examples
+    ```yaml
+      - from: ftp://remote-ftp-server.com/path/to/folder/
+        name: my_dataset
+        params:
+          file_format: csv
+          ftp_user: my-ftp-user
+          ftp_pass_key: my-ftp-password-key
+    ```
   </TabItem>
   <TabItem value="sftp" label="SFTP">
-    - `file_format`: Optional parameter, specifies the requested file format.
+    ### Parameters
+
+    The connection to SFTP can be configured by providing the following params:
+
+    - `file_format`: Optional, specifies the requested file format.
       - `parquet`: (default) Parquet file format.
       - `csv`: CSV file format.
-    - `sftp_port`: Optional parameter, specifies the port of the SFTP server. Default is 22. E.g. `sftp_port: 22`
-    - `sftp_user`: The username for the FTP server. E.g. `sftp_user: my-sftp-user`
-    - `sftp_pass`: The password for the FTP server. E.g. `sftp_pass: my-sftp-password`
+    - `sftp_port`: Optional, specifies the port of the SFTP server. Default is 22. E.g. `sftp_port: 22`
+    - `sftp_user`: The username for the SFTP server. E.g. `sftp_user: my-sftp-user`
+    - `sftp_pass`: The password for the SFTP server. E.g. `sftp_pass: my-sftp-password`
     - `sftp_pass_key`: The secret key container the password to connect with. E.g. `sftp_pass_key: my-sftp-password-key`
-  </TabItem>
-</Tabs>
-
-Configuration `params` are provided either in the top level `dataset` for a dataset source and federated SQL query.
-
-```yaml
-  - from: ftp://remote-ftp-server.com/path/to/folder/
-    name: my_dataset
-    params:
-      file_format: csv
-      ftp_user: my-ftp-user
-      ftp_pass_key: my-ftp-password
-```
-
-```yaml
-  - from: sftp://remote-ftp-server.com/path/to/folder/
-    name: my_dataset
-    params:
-      sftp_port: 20
-      sftp_user: my-ftp-user
-      sftp_pass_key: my-ftp-password
-```
 
+    Additional CSV-related parameters can be configured; see [CSV Parameters](../reference/file_format.md#csv).
 
+    ### Examples
+    ```yaml
+      - from: sftp://remote-sftp-server.com/path/to/folder/
+        name: my_dataset
+        params:
+          sftp_port: 20
+          sftp_user: my-sftp-user
+          sftp_pass_key: my-sftp-password-key
+    ```
+  </TabItem>
+</Tabs>
diff --git a/spiceaidocs/docs/data-connectors/s3.md b/spiceaidocs/docs/data-connectors/s3.md
@@ -7,11 +7,11 @@ description: 'S3 Data Connector Documentation'
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 
-The S3 Data Connector enables federated SQL query across Parquet files stored in S3, or S3-compatible storage solutions (e.g. MinIO, Cloudflare R2).
+The S3 Data Connector enables federated SQL query across Parquet/CSV files stored in S3, or S3-compatible storage solutions (e.g. MinIO, Cloudflare R2).
 
-Support for Iceberg, CSV, and other file-formats are on the roadmap.
+Support for Iceberg and other file formats is on the roadmap.
 
-If a folder is provided, all child Parquet files will be loaded.
+If a folder is provided, all child Parquet/CSV files will be loaded.
 
 ## Dataset Schema Reference
 
@@ -29,12 +29,12 @@ Example: `name: cool_dataset`
 
 ### `params` (optional)
 
-- `file_format`: Specifies the requested file format. Default is `parquet`.
-  - `parquet`: (default) Parquet file format.
-  - `csv`: CSV file format.
 - `endpoint`: The S3 endpoint, or equivalent (e.g. MinIO endpoint), for the S3-compatible storage. Defaults to region endpoint. E.g. `endpoint: https://my.minio.server`
 - `region`: Region of the S3 bucket, if region specific. Default value is `us-east-1`  E.g. `region: us-east-1`
 - `timeout`: Specifies timeout for S3 operations. Default value is `30s` E.g. `timeout: 60s`
+- `file_format`: Optional. The file format to query against, either `csv` or `parquet`. Defaults to `parquet`.
+
+Additional CSV-related parameters can be configured; see [CSV Parameters](../reference/file_format.md#csv).
 
 ## Auth
 

diff --git a/spiceaidocs/docs/reference/file_format.md b/spiceaidocs/docs/reference/file_format.md
@@ -0,0 +1,20 @@
+---
+title: "File Formats"
+sidebar_label: "File Formats"
+pagination_prev: 'reference/index'
+pagination_next: null
+---
+
+Spice currently supports the CSV and Parquet file formats. Support for Iceberg and other file formats is on the roadmap.
+
+The parameters supported for each file format are detailed on this page.
+
+## CSV
+
+### Parameters
+
+- `has_header`: Optional. Indicates whether the CSV file has a header row. Defaults to `true`
+- `quote`: Optional. A one-character string used to quote fields containing special characters. Defaults to `"`
+- `escape`: Optional. A one-character string used to escape special characters, so that characters normally interpreted as delimiters or newlines can appear within a field value. Defaults to `null`
+- `schema_infer_max_records`: Optional. The maximum number of records to scan when inferring the schema. Defaults to `1000`
+- `delimiter`: Optional. A one-character string used to separate individual fields. Defaults to `,`