TGSAI · tasansal · Jun 18, 2024
diff --git a/README.md b/README.md
@@ -79,6 +79,10 @@ It supports reading from local and cloud files (object store). It can read:
 - Disjoint sequential regions (fast)
 - Random traces (slow)
 
+The library will also try to infer the endianness and the revision of the SEG-Y
+file automatically. If it can't, users can override the endianness, revision, and
+other parameters using the settings.
+
 ### High Performance
 
 The performance is high and to be proven with upcoming benchmarks. The initial
@@ -91,13 +95,13 @@ data models and JSON schema parsing and validation.
 
 ### Predefined SEG-Y Standards
 
-It supports predefined SEG-Y "standards" for various versions. However,
-some versions are still in progress:
+It supports predefined SEG-Y "standards" for various versions. However, some versions
+are still in progress and not all validation logic is implemented yet:
 
-- [x] Rev 0 (1975)
-- [x] Rev 1 (2002)
-- [ ] Rev 2 (2017)
-- [ ] Rev 2.1 (2023)
+- ✅ Rev 0 (1975)
+- ✅ Rev 1 (2002)
+- ✅ Rev 2 (2017)
+- 🔲 Rev 2.1 (2023)
 
 ### Custom SEG-Y Standards
 

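Review note on the README hunk above: the automatic endianness inference can be pictured with a small heuristic. The sketch below is not taken from the library; it assumes the common approach of reading the data sample format code (bytes 3225-3226 of the file, i.e. offset 24 in the 400-byte binary header) in both byte orders and keeping the order that yields a plausible code. The function name `infer_endianness` and the constants are hypothetical.

```python
import struct
from typing import Optional

# Offset of the data sample format code inside the 400-byte binary header
# (bytes 3225-3226 of the file, after the 3200-byte textual header).
SAMPLE_FORMAT_OFFSET = 24
# Assumption: plausible format codes lie in 1..16, the range spanned by SEG-Y Rev 2.
PLAUSIBLE_FORMAT_CODES = range(1, 17)


def infer_endianness(binary_header: bytes) -> Optional[str]:
    """Hypothetical helper: return "big", "little", or None if undecidable."""
    raw = binary_header[SAMPLE_FORMAT_OFFSET : SAMPLE_FORMAT_OFFSET + 2]
    as_big = struct.unpack(">h", raw)[0]
    as_little = struct.unpack("<h", raw)[0]

    if as_big in PLAUSIBLE_FORMAT_CODES and as_little not in PLAUSIBLE_FORMAT_CODES:
        return "big"
    if as_little in PLAUSIBLE_FORMAT_CODES and as_big not in PLAUSIBLE_FORMAT_CODES:
        return "little"
    return None  # the caller would fall back to an explicit override


# Build a fake header whose format code (5 = IEEE float32) is stored little-endian.
header = bytearray(400)
header[SAMPLE_FORMAT_OFFSET : SAMPLE_FORMAT_OFFSET + 2] = struct.pack("<h", 5)
print(infer_endianness(bytes(header)))
```

When the heuristic returns `None`, that corresponds to the case the README describes, where the user supplies the endianness through the settings instead.
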
diff --git a/docs/api_reference.md b/docs/api_reference.md
@@ -21,26 +21,11 @@
 ## Configuration
 
 ```{eval-rst}
-.. autopydantic_settings:: segy.config.SegyFileSettings
+.. autopydantic_settings:: segy.config.SegySettings
     :inherited-members: BaseModel
 ```
 
 ```{eval-rst}
-.. autopydantic_settings:: segy.config.SegyBinaryHeaderSettings
-    :inherited-members: BaseModel
-```
-
-```{eval-rst}
-.. autopydantic_settings:: segy.config.ExtendedTextHeaderSetting
-    :inherited-members: BaseModel
-```
-
-```{eval-rst}
-.. autopydantic_settings:: segy.config.SampleIntervalSetting
-    :inherited-members: BaseModel
-```
-
-```{eval-rst}
-.. autopydantic_settings:: segy.config.SamplesPerTraceSetting
+.. autopydantic_settings:: segy.config.BinaryHeaderSettings
     :inherited-members: BaseModel
 ```
diff --git a/docs/cli_usage.md b/docs/cli_usage.md
@@ -187,8 +187,8 @@ trace_index
 ## Configuration Options
 
 When accessing public datasets from S3, we need to set
-`SegyFileSettings().storage_options = {"anon": True}`{l=python} for anonymous
-access. [SegyFileSettings](#SegyFileSettings) exposes all configuration options
+`SegySettings().storage_options = {"anon": True}`{l=python} for anonymous
+access. [SegySettings](#SegySettings) exposes all configuration options
 as environment variables. We just need to set `storage_options` with the `JSON`
 string `{"anon": true}`{l=python}. On Linux you can do this by the command below.
 Environment variables can be configured in many ways, please refer to the options

diff --git a/docs/data_models/data_type.md b/docs/data_models/data_type.md
@@ -22,7 +22,6 @@
    :nosignatures:
 
    ScalarType
-   DataFormat
    HeaderSpec
    HeaderField
    Endianness

diff --git a/docs/data_models/file.md b/docs/data_models/file.md
@@ -50,13 +50,13 @@ It must be set to one of the allowed [`SegyStandard`](#SegyStandard) values.
 
 #### Text File Header
 
-The [`text_file_header`](#SegySpec.text_file_header) stores the information
-required to parse the textual file header of the SEG-Y file. This includes important
-metadata that pertains to the seismic data in human-readable format.
+The [`text_header`](#SegySpec.text_header) stores the information required to parse
+the textual file header of the SEG-Y file. This includes important metadata that
+pertains to the seismic data in human-readable format.
 
 #### Binary File Header
 
-The [`binary_file_header`](#SegySpec.binary_file_header) item talks about
+The [`binary_header`](#SegySpec.binary_header) item describes
 the binary file header of the SEG-Y file. It is a set of structured and important
 information about the data in the file, stored in binary format for machines to
 read and process quickly and efficiently.

diff --git a/docs/settings.md b/docs/settings.md
@@ -15,43 +15,39 @@
 :class-container: sd-p-0 sd-outline-muted sd-rounded-3 sd-font-weight-light
 ```
 
-## `SegyFileSettings` Class
+## `SegySettings` Class
 
-The [SegyFileSettings] is a configuration object for the
+The [SegySettings] is a configuration object for the
 [SegyFile] in the environment. It allows you to customize various aspects of
 SEG-Y file parsing according to your needs and the specifics of your project.
 
 It is composed of various sub-settings isolated by SEG-Y components and various topics.
 
-- **binary**: The [SegyBinaryHeaderSettings] is used for binary header configuration
-  while reading a SEG-Y file.
-- **endian**: This setting determines the byte order that is being used in the SEG-Y file.
+- **binary**: The [BinaryHeaderSettings] is used for binary header overrides
+  when reading a SEG-Y file.
+- **endianness**: This setting determines the byte order that is being used in the SEG-Y file.
   The possible options are `"big"` or `"little"` based on [Endianness]. If left as None,
   the system defaults to Big Endian (`"big"`).
 - **revision**: This setting is used to specify the SEG-Y revision number. If left as
   None, the system will automatically use the revision mentioned in the SEG-Y file.
-- **use_pandas**: This setting is a boolean that decides whether to use pandas for
-  headers or not. Does not apply to trace data. The trace data is always returned
-  as Numpy arrays. The option to use Numpy for headers is currently disabled and will
-  be available at a later release (as of March 2024).
+- **storage_options**: Provides a hook to pass parameters to the storage backend, such as
+  credentials or anonymous access.
 
 ## Usage
 
-You initialize an instance of [SegyFileSettings] like any other Python object,
+You initialize an instance of [SegySettings] like any other Python object,
 optionally providing initial values for the settings. For example:
 
 ```python
-from segy.config import SegyBinaryHeaderSettings
+from segy.config import BinaryHeaderSettings
 from segy.config import SegySettings
 from segy.schema import Endianness
 
 # Override extended text header count to zero
-binary_header_settings = SegyBinaryHeaderSettings(
-    extended_text_header={"value": 0}
-)
+bin_overrides = BinaryHeaderSettings(extended_text_header=0)
 
 settings = SegySettings(
-    binary=binary_header_settings,
+    binary=bin_overrides,
-    endian=Endianness.LITTLE,
+    endianness=Endianness.LITTLE,
     revision=1,
 )
@@ -68,25 +64,24 @@ file = SegyFile(uri="...", settings=settings)
 If no settings are provided to [SegyFile], it will take the default values.
 
 ```{seealso}
-[SegyFileSettings], [SegyFile], [Endianness]
+[SegySettings], [SegyFile], [Endianness]
 ```
 
 ## Environment Variables
 
 Environment variables that follow the `SEGY__VARIABLE__SUBVARIABLE` format will be
-automatically included in your [SegyFileSettings] instance:
+automatically included in your [SegySettings] instance:
 
 ```shell
-export SEGY__BINARY__SAMPLES_PER_TRACE__VALUE=1001
-export SEGY__BINARY__SAMPLE_INTERVAL__KEY="my_custom_key_in_schema"
-export SEGY__ENDIAN="big"
-export SEGY__REVISION=0.0
+export SEGY__BINARY__SAMPLES_PER_TRACE=1001
+export SEGY__ENDIANNESS="big"
+export SEGY__REVISION=0
 ```
 
-The environment variables will override the defaults in the [SegyFileSettings]
+The environment variables will override the defaults in the [SegySettings]
 configuration, unless user overrides it again within Python.
 
 [endianness]: #Endianness
-[segyfilesettings]: #SegyFileSettings
+[segysettings]: #SegySettings
 [segyfile]: #SegyFile
-[segybinaryheadersettings]: #SegyBinaryHeaderSettings
+[binaryheadersettings]: #BinaryHeaderSettings
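Review note on the `docs/settings.md` hunk above: the `SEGY__VARIABLE__SUBVARIABLE` naming convention can be illustrated with a few lines of standard-library Python. This is only a sketch of how double-underscore-delimited names map onto nested settings; the real parsing and type coercion are done by the settings class, and the helper name `collect_overrides` is hypothetical.

```python
from typing import Dict, Mapping

PREFIX = "SEGY__"
DELIMITER = "__"


def collect_overrides(environ: Mapping[str, str]) -> Dict[str, object]:
    """Hypothetical helper: turn SEGY__A__B=value into {"a": {"b": "value"}}."""
    overrides: Dict[str, object] = {}
    for name, raw_value in environ.items():
        if not name.startswith(PREFIX):
            continue  # unrelated environment variable
        *parents, leaf = name[len(PREFIX):].lower().split(DELIMITER)
        node = overrides
        for key in parents:
            node = node.setdefault(key, {})  # descend, creating sub-settings as needed
        node[leaf] = raw_value  # values stay strings; coercion is left to the settings model
    return overrides


example_env = {
    "SEGY__BINARY__SAMPLES_PER_TRACE": "1001",
    "SEGY__ENDIANNESS": "big",
    "SEGY__REVISION": "0",
    "HOME": "/home/user",
}
print(collect_overrides(example_env))
```

Passing `os.environ` instead of `example_env` would apply the same mapping to the variables exported in the shell snippet.
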
diff --git a/docs/tutorials/creation.ipynb b/docs/tutorials/creation.ipynb
@@ -27,7 +27,7 @@
    "outputs": [],
    "source": [
     "from segy.factory import SegyFactory\n",
-    "from segy.standards.rev1 import rev1_segy"
+    "from segy.standards import get_segy_standard"
    ]
   },
   {
@@ -49,12 +49,13 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "SAMPLE_INTERVAL = 4000  # in microseconds\n",
-    "SAMPLES_PER_TRACE = 101\n",
+    "factory_config = {\n",
+    "    \"spec\": get_segy_standard(1.0),\n",
+    "    \"samples_per_trace\": 101,\n",
+    "    \"sample_interval\": 4_000,  # in microseconds\n",
+    "}\n",
     "\n",
-    "factory = SegyFactory(\n",
-    "    rev1_segy, sample_interval=SAMPLE_INTERVAL, samples_per_trace=SAMPLES_PER_TRACE\n",
-    ")\n",
+    "factory = SegyFactory(**factory_config)\n",
     "\n",
     "txt = factory.create_textual_header()\n",
     "bin_ = factory.create_binary_header()"
@@ -83,13 +84,13 @@
     "samples = factory.create_trace_sample_template(size=TRACE_COUNT)\n",
     "\n",
     "for trace_idx in range(TRACE_COUNT):\n",
-    "    headers[trace_idx][\"trace_seq_file\"] = trace_idx + 1\n",
+    "    headers[trace_idx][\"trace_seq_num_reel\"] = trace_idx + 1\n",
     "    headers[trace_idx][\"cdp_x\"] = 1_000\n",
     "    headers[trace_idx][\"cdp_y\"] = 10_000 + trace_idx * 50\n",
     "    headers[trace_idx][\"inline\"] = 10\n",
     "    headers[trace_idx][\"crossline\"] = 100 + trace_idx\n",
     "\n",
-    "    samples[trace_idx] = range(SAMPLES_PER_TRACE)  # sample index\n",
+    "    samples[trace_idx] = range(factory_config[\"samples_per_trace\"])  # sample index\n",
     "    samples[trace_idx] += trace_idx  # trace no"
    ]
   },
@@ -190,7 +191,7 @@
    "outputs": [],
    "source": [
     "show_fields = [\n",
-    "    \"trace_seq_file\",\n",
+    "    \"trace_seq_num_reel\",\n",
     "    \"cdp_x\",\n",
     "    \"cdp_y\",\n",
     "    \"inline\",\n",
@@ -203,7 +204,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "ef613e27-20cf-420c-85d5-cfa630b28def",
+   "id": "96fafaae-894a-447a-adcb-d98ffa70a0ad",
    "metadata": {},
    "outputs": [],
    "source": []
@@ -224,7 +225,8 @@
    "mimetype": "text/x-python",
    "name": "python",
    "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3"
+   "pygments_lexer": "ipython3",
+   "version": "3.12.3"
   }
  },
  "nbformat": 4,

diff --git a/docs/tutorials/quickstart.ipynb b/docs/tutorials/quickstart.ipynb
@@ -48,7 +48,7 @@
     "from segy import SegyFile\n",
     "from segy.config import SegySettings\n",
     "from segy.schema import HeaderField\n",
-    "from segy.standards import rev1_segy"
+    "from segy.standards import get_segy_standard"
    ]
   },
   {
@@ -61,13 +61,13 @@
     "[http link]: http://s3.amazonaws.com/open.source.geoscience/open_data/newzealand/Taranaiki_Basin/PARIHAKA-3D/Parihaka_PSTM_full_angle.sgy\n",
     "\n",
     "This link is convenient as the `segy` library supports HTTP and we can directly use it\n",
-    "without downloading as well. Hovewer, For demonstration purposes, we'll use the \n",
+    "without downloading as well. However, for demonstration purposes, we'll use the\n",
     "corresponding S3 link (or called bucket and prefix):\n",
     "\n",
     "`s3://open.source.geoscience/open_data/newzealand/Taranaiki_Basin/PARIHAKA-3D/Parihaka_PSTM_full_angle.sgy`\n",
     "\n",
     "It's important to note that the file isn't downloaded but rather read on demand from the\n",
-    "S3 object store with the `segy` library. \n",
+    "S3 object store with the `segy` library.\n",
     "\n",
     "The `SegyFile` class uses information from the binary file header to construct a SEG-Y\n",
     "descriptor, allowing it to read the file. The SEG-Y Revision is inferred from the binary\n",
@@ -202,7 +202,7 @@
    "id": "7ba3bd7911a900ec",
    "metadata": {},
    "source": [
-    "We can look at headers (by default it is a Pandas `DataFrame`) in a nicely formatted table. \n",
+    "We can look at headers (by default it is a Pandas `DataFrame`) in a nicely formatted table.\n",
     "\n",
     "We can also do typical Pandas analytics (like plots, statistics, etc.) but it won't be shown here."
    ]
@@ -281,9 +281,9 @@
     "Based on the text header lines:\n",
     "\n",
     "```\n",
-    "C 2 HEADER BYTE LOCATIONS AND TYPES:                                            \n",
-    "C 3     3D INLINE : 189-192 (4-BYTE INT)    3D CROSSLINE: 193-196 (4-BYTE INT)  \n",
-    "C 4     ENSEMBLE X: 181-184 (4-BYTE INT)    ENSEMBLE Y  : 185-188 (4-BYTE INT)   \n",
+    "C 2 HEADER BYTE LOCATIONS AND TYPES:\n",
+    "C 3     3D INLINE : 189-192 (4-BYTE INT)    3D CROSSLINE: 193-196 (4-BYTE INT)\n",
+    "C 4     ENSEMBLE X: 181-184 (4-BYTE INT)    ENSEMBLE Y  : 185-188 (4-BYTE INT)\n",
     "```\n",
     "\n",
     "As we know by the SEG-Y Rev1 definition, the coordinate scalars are at byte 71."
@@ -296,18 +296,19 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "custom_spec = rev1_segy.customize(\n",
+    "rev1 = get_segy_standard(1.0)\n",
+    "custom_spec = rev1.customize(\n",
     "    binary_header_fields=[\n",
-    "        HeaderField(name=\"sample_int\", byte=17, format=\"int16\"),\n",
-    "        HeaderField(name=\"num_samples\", byte=21, format=\"int16\"),\n",
-    "        HeaderField(name=\"num_ext_text_headers\", byte=305, format=\"int16\"),\n",
+    "        HeaderField(name=\"sample_interval\", byte=17, format=\"int16\"),\n",
+    "        HeaderField(name=\"samples_per_trace\", byte=21, format=\"int16\"),\n",
+    "        HeaderField(name=\"num_extended_text_headers\", byte=305, format=\"int16\"),\n",
     "    ],\n",
     "    trace_header_fields=[\n",
     "        HeaderField(name=\"inline\", byte=189, format=\"int32\"),\n",
     "        HeaderField(name=\"crossline\", byte=193, format=\"int32\"),\n",
     "        HeaderField(name=\"cdp_x\", byte=181, format=\"int32\"),\n",
     "        HeaderField(name=\"cdp_y\", byte=185, format=\"int32\"),\n",
-    "        HeaderField(name=\"scalar_coord\", byte=71, format=\"int16\"),\n",
+    "        HeaderField(name=\"coordinate_scalar\", byte=71, format=\"int16\"),\n",
     "    ],\n",
     ")\n",
     "\n",
@@ -398,8 +399,8 @@
    "source": [
     "trace_headers = traces.header.to_dataframe()\n",
     "\n",
-    "trace_headers[\"cdp_x\"] /= trace_headers[\"scalar_coord\"].abs()\n",
-    "trace_headers[\"cdp_y\"] /= trace_headers[\"scalar_coord\"].abs()\n",
+    "trace_headers[\"cdp_x\"] /= trace_headers[\"coordinate_scalar\"].abs()\n",
+    "trace_headers[\"cdp_y\"] /= trace_headers[\"coordinate_scalar\"].abs()\n",
     "\n",
     "trace_headers"
    ]
@@ -468,7 +469,8 @@
    "mimetype": "text/x-python",
    "name": "python",
    "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3"
+   "pygments_lexer": "ipython3",
+   "version": "3.12.3"
   }
  },
  "nbformat": 4,

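Review note on the quickstart notebook hunk above: the cell divides `cdp_x` and `cdp_y` by the absolute value of `coordinate_scalar`, which matches the SEG-Y convention only when the scalar is negative. A more general sketch of the convention (positive scalar multiplies, negative scalar divides, zero is treated as 1) is shown below in plain Python. The function name `apply_coordinate_scalar` is hypothetical and not part of the library.

```python
def apply_coordinate_scalar(value: float, scalar: int) -> float:
    """Hypothetical helper applying the SEG-Y coordinate scalar convention."""
    if scalar > 0:
        return float(value * scalar)  # positive scalar is a multiplier
    if scalar < 0:
        return value / abs(scalar)  # negative scalar is a divisor
    return float(value)  # zero is treated as a scalar of 1


# A stored value of 1000 with scalar -10 represents 100.0 in real units.
print(apply_coordinate_scalar(1000, -10))
```

For a DataFrame of trace headers, the same rule could be applied column-wise, for example by branching on the sign of the scalar column before scaling `cdp_x` and `cdp_y`.
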
diff --git a/poetry.lock b/poetry.lock