Microsoft Fabric #1198

rovin-ms · 2024-01-12T13:54:23Z

Expected behavior

I'm attempting to set up Sedona in Microsoft Fabric. I've loaded the Python packages and the JAR files.

Actual behavior

When I run this code:

import geopandas as gpd

from sedona.spark import *

SedonaContext.builder().config("spark.sql.autoBroadcastJoinThreshold", "10485760")

config = SedonaContext.builder() .\
    config('spark.jars.packages',
           'org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.5.0,'
           'org.datasyslab:geotools-wrapper:1.5.0-28.2'). \
    getOrCreate()

sedona = SedonaContext.create(config)

I receive this error:

TypeError Traceback (most recent call last)
Cell In[19], line 13
5 SedonaContext.builder().config("spark.sql.autoBroadcastJoinThreshold", "10485760")
7 config = SedonaContext.builder() .\
8 config('spark.jars.packages',
9 'org.apache.sedona:sedona-spark-shaded-3.0_2.12:1.5.0,'
10 'org.datasyslab:geotools-wrapper:1.5.0-28.2'). \
11 getOrCreate()
---> 13 sedona = SedonaContext.create(config)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/sedona/spark/SedonaContext.py:38, in SedonaContext.create(cls, spark)
36 spark.sql("SELECT 1 as geom").count()
37 PackageImporter.import_jvm_lib(spark._jvm)
---> 38 spark._jvm.SedonaContext.create(spark._jsparkSession)
39 return spark

TypeError: 'JavaPackage' object is not callable

Steps to reproduce the problem

Load all the Python packages in the Public Libraries of Fabric:

shapely="<=1.8.5"
pandas="<=1.3.5"
geopandas="<=0.10.2"
pyspark=">=2.3.0"
attrs=""
pyarrow=""
keplergl = "==0.3.2"
pydeck = "===0.8.0"

Settings

Sedona version = 1.5

Apache Spark version = 3.3.1.5.2-108696741

Apache Flink version = N/A

API type = Scala, Java, Python? Python

Scala version = 2.11, 2.12, 2.13? 2.12

JRE version = 1.8, 1.11?

Python version = 3.10

Environment = Standalone, AWS EC2, EMR, Azure, Databricks? Microsoft Fabric with Runtime 1.1 and 1.2.

rovin-ms · 2024-01-12T19:14:14Z

Found the solution. I placed the Sedona JAR files in an Azure Blob Storage container and then set the %%configure magic before running the code above. Here's the doc: https://learn.microsoft.com/en-us/fabric/data-engineering/author-execute-notebook#spark-session-configuration-magic-command

And here's roughly what I added to the first cell of the notebook:

%%configure -f
{
    "jars": [
        "https://xxxxxx.blob.core.windows.net/jars/sedona-spark-shaded-3.0_2.12-1.5.0.jar",
        "https://xxxxxx.blob.core.windows.net/jars/geotools-wrapper-1.5.0-28.2.jar"
    ]
}

I was then able to create the SedonaContext. Note that adding the libraries under the Workspace libraries did not appear to work: https://learn.microsoft.com/en-us/fabric/data-engineering/environment-manage-library
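For anyone hitting the same `TypeError: 'JavaPackage' object is not callable`: py4j resolves dotted JVM names lazily, and a name it cannot find as a class on the classpath comes back as a `JavaPackage` placeholder, so this error usually suggests the Sedona jars were not loaded into the session. Below is a minimal sketch of a pre-flight check, assuming a live `spark` session in the notebook. The helper names are hypothetical, and the check relies on `spark._jvm`, a private PySpark attribute (the same one that appears in the traceback above), so treat it as a diagnostic aid rather than a supported API.

```python
def is_unresolved_jvm_name(obj) -> bool:
    """Return True if py4j handed back a JavaPackage placeholder.

    The check goes by type name so that this helper does not need to
    import py4j itself (an assumption made for portability).
    """
    return type(obj).__name__ == "JavaPackage"


def sedona_jar_loaded(spark) -> bool:
    """Check whether the Sedona JVM entry point is visible to the session.

    Hypothetical helper: `spark` is expected to be an active SparkSession.
    """
    # Fully qualified name of the JVM class behind SedonaContext.create.
    candidate = spark._jvm.org.apache.sedona.spark.SedonaContext
    return not is_unresolved_jvm_name(candidate)
```

If `sedona_jar_loaded(spark)` returns False right after the `%%configure` cell has run, the jars were most likely not picked up, and `SedonaContext.create` can be expected to fail as in the traceback.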

Sarwat · 2024-01-18T18:37:20Z

Ron, this is great! Would you like to write an "Installing Sedona on MS Fabric" guide and contribute it to the Sedona docs? I am sure many users will find it quite useful!


rovin-ms · 2024-01-25T14:16:41Z

Hi Mo,

Thanks for replying and, yes, I would be happy to write this up. Can you let me know the steps?

Thanks!
Ron


jiayuasu · 2024-01-25T18:12:56Z

@rovin-ms You can compile the Sedona doc locally (https://sedona.apache.org/1.5.1/setup/compile/#compile-the-documentation):

Install libs

pip install mkdocs
pip install mkdocs-material
pip install mkdocs-macros-plugin
pip install mkdocs-git-revision-date-localized-plugin
pip install mike

Run:

mkdocs serve

You can add a page under https://github.com/apache/sedona/tree/master/docs/setup and then add it to mkdocs.yml. It will then show up on the website alongside the tutorials for Databricks and EMR, for example: https://sedona.apache.org/1.5.1/setup/emr/

robertnagy1 · 2024-03-01T13:01:25Z

I can confirm that this is an issue in Fabric. It is quite annoying that this has to be set at the session level every time, which requires a restart of the Spark pool.

adild2k · 2024-03-10T18:51:15Z

To those who were able to resolve the issue of using Apache Sedona in Microsoft Fabric: could you please lay out the steps in sequential order so that others can follow them properly?

Thanks in advance,

Regards
Adil

robertnagy1 · 2024-03-11T07:05:05Z

I think rovin-ms described how he did it above. I followed the same steps and it works. The downside is that it adds about 2-4 minutes to computations.

adild2k · 2024-03-11T07:16:32Z

I was not clear on where the jar files need to be placed.

robertnagy1 · 2024-03-11T07:19:26Z

I have them in blob storage, and I am accessing them over HTTPS, just like rovin-ms. You could just as well host them on GitHub or anywhere else; that would work too.

adild2k · 2024-03-11T07:23:23Z

Thanks for the response. I will host those files on Azure Blob Storage with the help of my company's Azure specialist. I will get back to you if I run into any issues.

adild2k · 2024-03-11T07:25:27Z

https://jar-download.com/download-handling.php

Is this the right site to download the jar files from?

robertnagy1 · 2024-03-11T07:27:14Z

I think you should use one of the Maven repositories and match your Scala and Spark versions: https://mvnrepository.com/search?q=sedona

adild2k · 2024-03-11T08:19:58Z

Sorry, I am a novice in this area and am not able to find the respective jar files.

Also, is there a specific folder name, and are there any access rights that need to be granted?

jiayuasu · 2024-03-11T08:24:09Z

Hi @adild2k , below are the Sedona jars you will need:

Spark 3.0 to 3.3 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/1.5.1/
Spark 3.4 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.4_2.12/1.5.1/
Spark 3.5 + Scala 2.12: https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.5_2.12/1.5.1/

In addition, you will need geotools-wrapper: https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.1-28.2/

For example, sedona-spark-shaded-3.5_2.12-1.5.1.jar
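The mapping above can be turned into direct download URLs and into the text of a `%%configure` cell like the one posted earlier in this thread. This is only a sketch under stated assumptions: the function names are hypothetical, the defaults (`1.5.1`, Scala `2.12`, geotools suffix `28.2`) are taken from the links above, and the URL layout is the standard Maven Central path for those artifacts. Versions outside the three Spark lines listed above are rejected rather than guessed.

```python
import json

MAVEN_BASE = "https://repo1.maven.org/maven2"


def sedona_spark_line(spark_version: str) -> str:
    """Map a Spark version to the Spark suffix of the Sedona artifact.

    Follows the list above: Spark 3.0 to 3.3 share the "3.0" artifact,
    while 3.4 and 3.5 each have their own.
    """
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    if major != 3 or minor > 5:
        raise ValueError(f"No artifact listed above for Spark {spark_version}")
    return "3.0" if minor <= 3 else f"3.{minor}"


def sedona_jar_urls(spark_version, sedona_version="1.5.1",
                    scala_version="2.12", geotools_suffix="28.2"):
    """Build download URLs for the shaded Sedona jar and geotools-wrapper."""
    shaded = f"sedona-spark-shaded-{sedona_spark_line(spark_version)}_{scala_version}"
    wrapper_version = f"{sedona_version}-{geotools_suffix}"
    return [
        f"{MAVEN_BASE}/org/apache/sedona/{shaded}/{sedona_version}/"
        f"{shaded}-{sedona_version}.jar",
        f"{MAVEN_BASE}/org/datasyslab/geotools-wrapper/{wrapper_version}/"
        f"geotools-wrapper-{wrapper_version}.jar",
    ]


def configure_cell(jar_urls) -> str:
    """Render the text of a %%configure cell that loads the given jars."""
    return "%%configure -f\n" + json.dumps({"jars": list(jar_urls)}, indent=2)


# The Spark version reported in the issue (3.3.x) maps to the 3.0 artifact.
print(configure_cell(sedona_jar_urls("3.3.1.5.2-108696741")))
```

The printed text can be pasted into the first cell of a notebook. Whether Fabric can fetch the jars from these URLs depends on the workspace having outbound internet access.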

adild2k · 2024-03-11T08:54:00Z

Thanks again for the helpful response.

Also, is there a specific folder on Azure Blob Storage where these files need to be placed, and are there any access rights that need to be granted?

robertnagy1 · 2024-03-11T09:25:21Z

It does not have to be stored in Azure Blob Storage, but if you store it there, make sure that the container and folder you are pointing to are reachable without any kind of authentication. The %%configure magic command only works in that form if the URLs are reachable without authentication.
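One way to confirm that a jar URL is reachable without authentication before pointing `%%configure` at it is to send an unauthenticated HEAD request. A minimal sketch using only the Python standard library follows; the function name and the injectable `opener` parameter are hypothetical conveniences (the latter makes the check testable without network access), and a HEAD request is assumed to be accepted by the host.

```python
import urllib.error
import urllib.request


def is_anonymously_reachable(url: str, opener=urllib.request.urlopen,
                             timeout: float = 10.0) -> bool:
    """Return True if an unauthenticated HEAD request to `url` succeeds."""
    # No credentials or tokens are attached to this request.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTPError (401, 403, 404, ...) is a subclass of URLError.
        return False
```

Calling it with each URL from the `jars` list should return True for every entry. A False result for a blob URL would suggest the container does not allow anonymous read access.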

adild2k · 2024-03-11T09:35:17Z

Thanks for the response. I have saved the files in the respective folder on Azure; see the snapshot below. After that, I need to follow the step mentioned above. Is my understanding right?

%%configure -f
{
    "jars": [
        "https://xxxxxx.blob.core.windows.net/jars/sedona-spark-shaded-3.0_2.12-1.5.0.jar",
        "https://xxxxxx.blob.core.windows.net/jars/geotools-wrapper-1.5.0-28.2.jar"
    ]
}

robertnagy1 · 2024-03-11T09:46:34Z

Run it inline in a notebook cell. It will take some time to fire up the new Spark cluster.

adild2k · 2024-03-11T09:54:13Z

Please check the snapshot below. What should be the next step?

robertnagy1 · 2024-03-11T10:00:47Z

Follow the tutorial on the official website for creating the Sedona context.

adild2k · 2024-03-11T10:40:25Z

Sorry, I am a novice in this area. Please check the snapshot. I know something is missing, but I am not able to figure out what.

robertnagy1 · 2024-03-11T11:55:54Z

You have to add the Python libraries as well: apache-sedona, geopandas (0.11, I think), keplergl, and pydeck.

adild2k · 2024-03-11T12:13:32Z

I had already installed keplergl and pydeck earlier, along with geopandas. It is still giving the same error.

robertnagy1 · 2024-03-11T12:19:18Z

What about the apache-sedona Python module? Have you installed that as well?

adild2k · 2024-03-11T12:29:29Z

Yes, of course. That was the first one I installed.

adild2k · 2024-03-12T07:26:42Z

Still the same error. Any suggestions or pointers?

robertnagy1 · 2024-04-30T08:27:27Z

By the way, I submitted a ticket to Microsoft regarding this. It is an environment problem that they should fix. We shouldn't have to rely on workarounds.

jiayuasu · 2024-04-30T08:34:32Z

Thanks, guys. I recently created a tutorial about installing Sedona on Fabric. It will be published on the Sedona website soon:

https://github.com/apache/sedona/blob/master/docs/setup/fabric.md

robertnagy1 · 2024-04-30T08:37:24Z

@jiayuasu Isn't it better to refer to the jar files directly from the Maven repo? I did something like this, so I don't need to host the jar files on remote storage:

%%configure -f
{
    "jars": [
        "https://repo1.maven.org/maven2/org/datasyslab/geotools-wrapper/1.5.1-28.2/geotools-wrapper-1.5.1-28.2.jar",
        "https://repo1.maven.org/maven2/org/apache/sedona/sedona-spark-shaded-3.0_2.12/1.5.1/sedona-spark-shaded-3.0_2.12-1.5.1.jar"
    ]
}

jiayuasu · 2024-04-30T08:49:41Z

@robertnagy1 Good point. But does Microsoft Fabric always have internet access? Might some users intentionally disable it for security purposes?

robertnagy1 · 2024-04-30T08:53:24Z

I guess having it on a remote abfss path requires internet access as well, as does accessing Fabric, and so do the Spark clusters. The Lakehouse (as far as I know, but I might be wrong) is an abstraction layer above Microsoft OneLake, which is separate from the Spark instances. I think it is safe to assume that Fabric requires internet access to work.

jiayuasu · 2024-04-30T08:55:09Z

Makes sense. Will update the doc accordingly.

rovin-ms closed this as completed Jan 18, 2024

jiayuasu linked a pull request Apr 22, 2024 that will close this issue

[DOCS] Add Microsoft Fabric tutorial #1350

Merged
