
Commit

removing explode_outer
orellabac committed Oct 17, 2023
1 parent 375d74d commit f8bcf49
Showing 5 changed files with 51 additions and 11 deletions.
7 changes: 6 additions & 1 deletion CHANGE_LOG.txt
@@ -171,4 +171,9 @@ Version 0.0.32

Version 0.0.33
--------------
- Falling back to builtin applyInPandas implementation
- Falling back to builtin applyInPandas implementation

Version 0.0.34
--------------
- explode_outer has been removed from this library as it is supported natively by Snowpark.
- Updated README with information on how to use the default `connections.toml`
47 changes: 43 additions & 4 deletions README.md
@@ -59,6 +59,47 @@ import snowpark_extensions
new_session = Session.builder.env().appName("app1").create()
```


> NOTE: since 1.8.0 the [python connector was updated](https://docs.snowflake.com/en/release-notes/clients-drivers/python-connector-2023#version-3-1-0-july-31-2023) and we support a unified configuration store for `snowflake-python-connector` and `snowflake-snowpark-python` with this approach.
>
> You can use these connections by leveraging `Session.builder.getOrCreate()` or `Session.builder.create()`.
>
> By default, we look for the `connections.toml` file in the location specified in the `SNOWFLAKE_HOME` environment variable (default: `~/.snowflake`). If this folder does not exist, the Python connector looks for the file in the `platformdirs` location, as follows:
>
> * On Linux: `~/.config/snowflake/`, but follows XDG settings
> * On Mac: `~/Library/Application Support/snowflake/`
> * On Windows: `%USERPROFILE%\AppData\Local\snowflake\`
>
> The default connection is named 'default', but this can be controlled with the environment variable `SNOWFLAKE_DEFAULT_CONNECTION_NAME`.
>
> If you don't want to use a file, you can set the file contents through the `SNOWFLAKE_CONNECTIONS` environment variable.
>
> The connection file looks like this:
>
> ```
> [default]
> accountname = "myaccount"
> username = "user1"
> password = 'xxxxx'
> rolename = "user_role"
> dbname = "demodb"
> schemaname = "public"
> warehousename = "load_wh"
>
>
> [snowpark]
> accountname = "myaccount"
> username = "user2"
> password = 'yyyyy'
> rolename = "user_role"
> dbname = "demodb"
> schemaname = "public"
> warehousename = "load_wh"
>
> ```

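For illustration, here is a minimal sketch (not part of this commit; the connection names are just the ones from the example above) of creating a session from one of those connections:

```python
# Minimal sketch: create a session from connections.toml.
# Assumes a [default] (or [snowpark]) entry exists in ~/.snowflake/connections.toml,
# or in the folder pointed to by SNOWFLAKE_HOME.
import os
from snowflake.snowpark import Session
import snowpark_extensions  # registers the extension methods

# Optional: pick a non-default connection entry by name.
os.environ["SNOWFLAKE_DEFAULT_CONNECTION_NAME"] = "snowpark"

session = Session.builder.getOrCreate()
print(session.sql("select current_role(), current_warehouse()").collect())
```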

The `appName` can be used to set up a query_tag like `APPNAME=tag;execution_id=guid`, which can then be used to track job actions.

You can then use a query like:
@@ -192,7 +233,6 @@ df.group_by("ID").applyInPandas(
normalize, schema="id long, v double").show()
```


```
------------------------------
|"ID" |"V" |
@@ -205,7 +245,6 @@
------------------------------
```
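The `normalize` helper used above is defined earlier in the README (collapsed in this diff view); as an illustrative assumption only, a typical definition compatible with `applyInPandas` looks like this:

```python
# Illustrative assumption, not the repo's exact code: a pandas function that
# receives each group as a DataFrame and returns a DataFrame matching the
# declared schema ("id long, v double"), here standardizing "v" per group.
import pandas as pd

def normalize(pdf: pd.DataFrame) -> pd.DataFrame:
    v = pdf.v
    return pdf.assign(v=(v - v.mean()) / v.std())
```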


> NOTE: since `snowflake-snowpark-python==1.8.0`, `applyInPandas` is available natively. This version is kept because:
>
> 1. It supports string schemas
Expand Down Expand Up @@ -285,7 +324,7 @@ That will return:
| functions.format_number | formats numbers using the specified number of decimal places |
| ~~functions.reverse~~ | ~~returns a reversed string~~ **Available in snowpark-python >= 1.2.0** |
| ~~functions.explode~~ | ~~returns a new row for each element in the given array~~ **Available in snowpark-python >= 1.4.0** |
| functions.explode_outer | returns a new row for each element in the given array or map. Unlike explode, if the array/map is null or empty then null is produced. |
| ~~functions.explode_outer~~ | ~~returns a new row for each element in the given array or map. Unlike explode, if the array/map is null or empty then null is produced.~~<br />**Available in snowpark-python >= 1.4.0**<br />Breaking change: `explode_outer` no longer takes the `map` argument. |
| functions.arrays_zip | returns a merged array of arrays |
| functions.array_sort | sorts the input array in ascending order. The elements of the input array must be orderable. Null elements will be placed at the end of the returned array. |
| functions.array_max | returns the maximum value of the array. |
@@ -352,7 +391,7 @@ sf_df = session.createDataFrame([(1, ["foo", "bar"], {"x": 1.0}), (2, [], {}), (
```

```
# +---+----------+----------+
# +---+----------+----------+
# | id| an_array| a_map|
# +---+----------+----------+
# | 1|[foo, bar]|{x -> 1.0}|
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@
this_directory = Path(__file__).parent
long_description = (this_directory / "README.md").read_text()

VERSION = '0.0.33'
VERSION = '0.0.34'

setup(name='snowpark_extensions',
version=VERSION,
4 changes: 0 additions & 4 deletions snowpark_extensions/functions_extensions.py
@@ -56,9 +56,6 @@ def create_map(*col_names):
col_list.append(_to_col_if_str(name,"create_map"))
col_list.append(value)
return object_construct(*col_list)

def _explode_outer(col,map=None):
return F.table_function("flatten")(input=col,outer=F.lit(True))

def _array(*cols):
return F.array_construct(*cols)
@@ -350,7 +347,6 @@ def map_values(obj:dict)->list:
F.array_sort = _array_sort
F.arrays_zip = _arrays_zip
F.create_map = create_map
F.explode_outer = _explode_outer
F.format_number = format_number
F.flatten = _array_flatten
F.map_values = _map_values
2 changes: 1 addition & 1 deletion tests/test_dataframe_extensions.py
@@ -164,7 +164,7 @@ def test_explode_outer_with_map():
# | 3| null| null|
# +---+----------+----------+

results = sf_df.select("id", "an_array", explode_outer("a_map",map=True)).collect()
results = sf_df.select("id", "an_array", explode_outer("a_map")).collect()
# +---+----------+----+-----+
# | id| an_array| KEY| VALUE|
# +---+----------+----+-----+
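As a final illustration of the breaking change noted in the README (a sketch only; the import path is an assumption, since the test's actual imports are collapsed in this diff), users who previously imported `explode_outer` from this library would now use the native function:

```python
# Sketch only: per the README note above, explode_outer is available natively in
# snowpark-python >= 1.4.0 and no longer takes the map=True argument.
from snowflake.snowpark.functions import explode_outer

results = sf_df.select("id", "an_array", explode_outer("a_map")).collect()
```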
