Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running ssas_api.py and getting connected to AAS issue #12

Closed
nisharncslabs opened this issue May 4, 2021 · 10 comments
Closed

Running ssas_api.py and getting connected to AAS issue #12

nisharncslabs opened this issue May 4, 2021 · 10 comments

Comments

@nisharncslabs
Copy link

nisharncslabs commented May 4, 2021

Hi, I followed the steps in the README.md file and imported the required dll extension (and edited the required version info in the python file path snippet provided as well).

However, once I go to run the a version of the Quickstart I run into an error: NameError: name 'DataTable' is not defined.

Here is the python file path snippet with the correct version and path that I have on my machine:

base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0"  # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"

My question is:

  • Do I need to edit the actual ssas_api.py script with the file path locations of the .dll files? Not entirely sure what would be required to edit from this in order for the Quickstart snippet to run. Any advice on editing the ssas_api.py script as per the python file path snippets above would be useful.
  • I also noticed midway down the ssas_api.py script I have on my local machine (line 70-73) the following commands aren't working (greyed out in PyCharm Editor):
import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as AMO
import Microsoft.AnalysisServices.AdomdClient as ADOMD

Not sure if I need to install a package for this to work. I tried performing pip install System but that didn't seem to work, it gave the following error ERROR: Could not find a version that satisfies the requirement System (from versions: none)

@yehoshuadimarsky
Copy link
Owner

Can you post your complete code example that fails?

@nisharncslabs
Copy link
Author

nisharncslabs commented May 5, 2021

Sure, so I have 2 files in my editor (ssas_api.py copy exactly same as in repo)

  1. Quick start script with code to be run to get dataframe back from DAX query ping to AAS server.
  2. ssas_api.py copy file containing functions which can connect to AAS.

These are both shown down below.

My main question is: I am not too sure what additional things I need to do along with having these files in order to run/get the DAX query data successfully from a particular AAS server.

  • I have confirmed the location of both the prerequisite dll files as being given in the python snippet in the Quickstart file below, not sure if I have to change this in the ssas_api.py file where it says root.
  • I have noticed that in the ssas_api.py file the import System command and subsequent ones below are greyed out as if they arent found. Do I need to pip install something inorder for these to work?

File 1: QuickStart

import ssas_api
import clr

'''
# Python Snippet showing where the required dll files are located

base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0"  # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"
'''

#_load_assemblies(amo_path=amo_path, adomd_path=adomd_path)


conn = ssas_api.set_conn_string(
    server='<aas_servername>',
    db_name='<db_name>',
    username='<email>',
    password='<password>'
)



dax_string = '''  
MY DAX QUERY
'''

df = ssas_api.get_DAX(connection_string=conn, dax_string=dax_string)

File 2: ssas_api.py script for AAS connection

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 20 16:59:43 2017

@author: Yehoshua
"""

import pandas as pd
import numpy as np
from functools import wraps
from pathlib import Path
import logging
import warnings

logger = logging.getLogger(__name__)

try:
    import clr  # name for pythonnet
except ImportError:
    msg = """
    Could not import 'clr', install the 'pythonnet' library. 
    For conda, `conda install -c pythonnet pythonnet`
    """
    raise ImportError(msg)


def _load_assemblies(amo_path=None, adomd_path=None):
    """
    Loads required assemblies, called after function definition.
    Might need to install SSAS client libraries:
    https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-data-providers

    Parameters
    ----------
    amo_path : str, default None
        The full path to the DLL file of the assembly for AMO.
        Should end with '**Microsoft.AnalysisServices.Tabular.dll**'
        Example: C:/my/path/to/Microsoft.AnalysisServices.Tabular.dll
        If None, will use the default location on Windows.
    adomd_path : str, default None
        The full path to the DLL file of the assembly for ADOMD.
        Should end with '**Microsoft.AnalysisServices.AdomdClient.dll**'
        Example: C:/my/path/to/Microsoft.AnalysisServices.AdomdClient.dll
        If None, will use the default location on Windows.
    """
    # Full path of .dll files
    root = Path(r"C:\Windows\Microsoft.NET\assembly\GAC_MSIL")
    # get latest version of libraries if multiple libraries are installed (max func)
    if amo_path is None:
        amo_path = str(
            max((root / "Microsoft.AnalysisServices.Tabular").iterdir())
            / "Microsoft.AnalysisServices.Tabular.dll"
        )
    if adomd_path is None:
        adomd_path = str(
            max((root / "Microsoft.AnalysisServices.AdomdClient").iterdir())
            / "Microsoft.AnalysisServices.AdomdClient.dll"
        )

    # load .Net assemblies
    logger.info("Loading .Net assemblies...")
    clr.AddReference("System")
    clr.AddReference("System.Data")
    clr.AddReference(amo_path)
    clr.AddReference(adomd_path)

    # Only after loaded .Net assemblies
    global System, DataTable, AMO, ADOMD

    **import System
    from System.Data import DataTable
    import Microsoft.AnalysisServices.Tabular as AMO
    import Microsoft.AnalysisServices.AdomdClient as ADOMD**

    logger.info("Successfully loaded these .Net assemblies: ")
    for a in clr.ListAssemblies(True):
        logger.info(a.split(",")[ 0 ])


def _assert_dotnet_loaded(func):
    """
    Wrapper to make sure that required .NET assemblies have been loaded and imported.
    Can pass the keyword arguments 'amo_path' and 'adomd_path' to any annotated function,
    it will use them in the `_load_assemblies` function.

    Example:
        .. code-block:: python

            import ssas_api
            conn = ssas_api.set_conn_string(
                's', 'd', 'u', 'p',
                amo_path='C:/path/number/one',
                adomd_path='C:/path/number/two'
            )
    """

    @wraps(func)
    def wrapper(*args, **kwargs):
        amo_path = kwargs.pop("amo_path", None)
        adomd_path = kwargs.pop("adomd_path", None)
        try:
            type(DataTable)
        except NameError:
            # .NET assemblies not loaded/imported
            logger.warning(".Net assemblies not loaded and imported, doing so now...")
            _load_assemblies(amo_path=amo_path, adomd_path=adomd_path)
        return func(*args, **kwargs)

    return wrapper


@_assert_dotnet_loaded
def set_conn_string(server, db_name, username, password):
    """
    Sets connection string to SSAS database,
    in this case designed for Azure Analysis Services
    """
    conn_string = (
        "Provider=MSOLAP;Data Source={};Initial Catalog={};User ID={};"
        "Password={};Persist Security Info=True;Impersonation Level=Impersonate".format(
            server, db_name, username, password
        )
    )
    return conn_string


@_assert_dotnet_loaded
def get_DAX(connection_string, dax_string):
    """
    Executes DAX query and returns the results as a pandas DataFrame

    Parameters
    ---------------
    connection_string : string
        Valid SSAS connection string, use the set_conn_string() method to set
    dax_string : string
        Valid DAX query, beginning with EVALUATE or VAR or DEFINE

    Returns
    ----------------
    pandas DataFrame with the results
    """
    table = _get_DAX(connection_string, dax_string)
    df = _parse_DAX_result(table)
    return df


def _get_DAX(connection_string, dax_string) -> "DataTable":
    dataadapter = ADOMD.AdomdDataAdapter(dax_string, connection_string)
    table = DataTable()
    logger.info("Getting DAX query...")
    dataadapter.Fill(table)
    logger.info("DAX query successfully retrieved")
    return table


def _parse_DAX_result(table: "DataTable") -> pd.DataFrame:
    cols = [ c for c in table.Columns.List ]
    rows = [ ]
    # much better performance to just access data by position instead of name
    # and then add column names afterwards
    for r in range(table.Rows.Count):
        row = [ table.Rows[ r ][ c ] for c in cols ]
        rows.append(row)

    df = pd.DataFrame.from_records(rows, columns=[ c.ColumnName for c in cols ])

    # replace System.DBNull with None
    # df.replace({System.DBNull: np.NaN}) doesn't work for some reason
    df = df.applymap(lambda x: np.NaN if isinstance(x, System.DBNull) else x)

    # convert datetimes
    dt_types = [ c.ColumnName for c in cols if c.DataType.FullName == "System.DateTime" ]
    if dt_types:
        for dtt in dt_types:
            # if all nulls, then pd.to_datetime will fail
            if not df.loc[ :, dtt ].isna().all():
                # https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-date-and-time-format-strings#Sortable
                ser = df.loc[ :, dtt ].map(lambda x: x.ToString('s'))
                df.loc[ :, dtt ] = pd.to_datetime(ser)

    # convert other types
    types_map = {"System.Int64": int, "System.Double": float, "System.String": str}
    col_types = {c.ColumnName: types_map.get(c.DataType.FullName, "object") for c in cols}

    # handle NaNs (which are floats, as of pandas v.0.25.3) in int columns
    col_types_ints = {k for k, v in col_types.items() if v == int}
    ser = df.isna().any(axis=0)
    col_types.update({k: float for k in set(ser[ ser ].index).intersection(col_types_ints)})

    # convert
    df = df.astype(col_types)

    return df


@_assert_dotnet_loaded
def process_database(connection_string, refresh_type, db_name):
    process_model(
        connection_string=connection_string,
        item_type="model",
        refresh_type=refresh_type,
        db_name=db_name,
    )


@_assert_dotnet_loaded
def process_table(connection_string, table_name, refresh_type, db_name):
    process_model(
        connection_string=connection_string,
        item_type="table",
        item=table_name,
        refresh_type=refresh_type,
        db_name=db_name,
    )


@_assert_dotnet_loaded
def process_model(connection_string, db_name, refresh_type="full", item_type="model", item=None):
    """
    Processes SSAS data model to get new data from underlying source.

    Parameters
    -------------
    connection_string : string
        Valid SSAS connection string, use the set_conn_string() method to set
    db_name : string
        The data model on the SSAS server to process
    refresh_type : string, default `full`
        Type of refresh to process. Currently only supports `full`.
    item_type : string, choice of {'model','table'}, default 'model'
    item : string, optional.
        Then name of the item. Only needed when item_type is 'table', to specify the table name
    """
    assert item_type.lower() in ("table", "model"), f"Invalid item type: {item_type}"
    if item_type.lower() == "table" and not item:
        raise ValueError("If item_type is table, must supply an item (a table name) to process")

    # connect to the AS instance from Python
    AMOServer = AMO.Server()
    logger.info("Connecting to database...")
    AMOServer.Connect(connection_string)

    # Dict of refresh types
    refresh_dict = {"full": AMO.RefreshType.Full}

    # process
    db = AMOServer.Databases[ db_name ]

    if item_type.lower() == "table":
        table = db.Model.Tables.Find(item)
        table.RequestRefresh(refresh_dict[ refresh_type ])
    else:
        db.Model.RequestRefresh(refresh_dict[ refresh_type ])

    op_result = db.Model.SaveChanges()
    if op_result.Impact.IsEmpty:
        logger.info("No objects affected by the refresh")

    logger.info("Disconnecting from Database...")
    # Disconnect
    AMOServer.Disconnect()

@yehoshuadimarsky
Copy link
Owner

No you do not need to install System or any of the .Net imports, those are .Net libraries that we are loading through Python using Pythonnet. This should work as written. Is it working?

@nisharncslabs
Copy link
Author

nisharncslabs commented May 5, 2021

No it doesn't seem to be working. The errors I find when I try running my Quickstart file (as stated above) are:

  1. NameError: name 'DataTable' is not defined: Which I think is caused by the for line of code in the ssas_api.py script not being detected.
  2. During handling of the above exception, another exception occurred: FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Windows\\Microsoft.NET\\assembly\\GAC_MSIL\\Microsoft.AnalysisServices.Tabular' : Which I think is the result of the .dll files not being in the location required/ assumed by your ssas_api.py script? As a result might need to change the .dll file location in the script using the python snippet provided? I might be wrong about this?

I just want to ensure that if we take the Quickstart & ssas_api.py (as is) into the same directory and run the Quickstart it should work as is? I am not entirely sure if I need to change anything or am doing something wrong. I just want to be able send a DAX query against AAS and receive a response and store the result in a dataframe. Not sure if you are able to provide further guidance on how to do this?

@nisharncslabs
Copy link
Author

I did also follow the steps in the README.md "Getting The Required .Net Libraries" regarding the .net libraries and confirmed that the following python snippet applies in my case:

base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0"  # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"

Do I need to copy this info into the ssas_api.py file in anyway for it to run?

@yehoshuadimarsky
Copy link
Owner

yes that is the issue. your .Net DLLs are not being found. All of the functions optionally take params to specify where they are (passing to the underlying _load_assemblies decorator function), if no params are passed then it looks in the default locations, see the code in the snippet below.

python-ssas/ssas_api.py

Lines 47 to 59 in fc18937

# Full path of .dll files
root = Path(r"C:\Windows\Microsoft.NET\assembly\GAC_MSIL")
# get latest version of libraries if multiple libraries are installed (max func)
if amo_path is None:
amo_path = str(
max((root / "Microsoft.AnalysisServices.Tabular").iterdir())
/ "Microsoft.AnalysisServices.Tabular.dll"
)
if adomd_path is None:
adomd_path = str(
max((root / "Microsoft.AnalysisServices.AdomdClient").iterdir())
/ "Microsoft.AnalysisServices.AdomdClient.dll"
)

So if it's not finding it in the default location, you should pass the location as params, such as this

df = ssas_api.get_DAX(
    connection_string=conn, 
    dax_string=dax_string, 
    amo_path='C:/path/number/one',
    adomd_path='C:/path/number/two'
)

Example in the code/docstring:

python-ssas/ssas_api.py

Lines 83 to 96 in fc18937

Wrapper to make sure that required .NET assemblies have been loaded and imported.
Can pass the keyword arguments 'amo_path' and 'adomd_path' to any annotated function,
it will use them in the `_load_assemblies` function.
Example:
.. code-block:: python
import ssas_api
conn = ssas_api.set_conn_string(
's', 'd', 'u', 'p',
amo_path='C:/path/number/one',
adomd_path='C:/path/number/two'
)
"""

@nisharncslabs
Copy link
Author

nisharncslabs commented May 5, 2021

Ok, that makes sense however even when I used the file paths as specififed ie AMO_PATH & ADOMD_PATH in the fashion that you are suggesting (inside the get_DAX function) it stills throws the error. I have attached a screenshot of the error and the code chunk I think its referring too.

2021-05-05 (3)_LI
2021-05-05 (5)_LI

@yehoshuadimarsky
Copy link
Owner

That doesn't make sense.

  • I don't see the exact error in the screen shot, what's the error?
  • what's your code?

@nisharncslabs
Copy link
Author

nisharncslabs commented May 5, 2021

The error that I am still getting is the same as what I described above.

  • After I tried the suggestion of forcefully specifying the file path of the .dll extension files it still comes up with the same error.
  • As you can see in the 1st screen shot the error (the stuff in red at the bottom) mentions NameError: DataTable not defined and if you look above you can see the greyed out code that I was mentioning. For some reason this code isn't being detected by Pycharm IDE not sure what I am missing or why this is but as a result it seems the ssas_api.py script doesn't work/execute as intended and hence cannot define the DataTable as the error says.
  • The 2nd screenshot shows the last part of the error (also described in an earlier post) about how the file path can't be found even though above I have forceibly specified the file path of the .dll files as you confirmed was what should be done.

@dominik-hayd
Copy link

Hey @nisharncslabs,
I know it's been a while since you created this issue and I'm not the owner of this project. But since i encountered the same issues I want to share how I solved them and my understanding on whats the problem in your case.

First of all: The lines 70-73 should be highlighted by your IDE. It reads these imports and looks for matching python packages, which are obviously not available. The imports are loaded through pythonnet from the .DLL files. But this is just a warning and can be safely ignored, because we know why they occur.

Now to your error message. I think python is a bit misleading in this case. It reports two exceptions:

  1. The NameError: DataTable not defined. The exception is thrown in line 102. @yehoshuadimarsky accesses the type of DataTable there to check whether the .Net dlls are already loaded. So at the first execution this check has to throw an exception to trigger the dll loading, which is done in the except block (line 107).
  2. The FileNotFound error. This comes from pythonnet trying to load one of the dll files. This is the real problem.

Because the second exception is thrown in the except block python shows you both exception.

Now to solve the problem:
(I can only tell you how I solved it 🙈)

I copied both .dll files (Tabular and AdomdClient) to the root directory of my project.
In my first call to python_ssas I passed absolute paths to both files.
conn = powerbi.set_conn_string('workspace', 'datset', 'user', 'password', amo_path='C:\\Users\\...\\Microsoft.AnalysisServices.Tabular.dll', adomd_path='C:\\Users\\...\\Microsoft.AnalysisServices.AdomdClient.dll')

For me this solves the error.
If this isn't working for you play around with importing the dlls for a bit:
Create a new python file to test just the import without an extra stuff.

import clr
clr.AddReference("System")
clr.AddReference("System.Data")
clr.AddReference(<path to your tabular dll>)
clr.AddReference(<path to your adomdclient dll>)

import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as AMO
import Microsoft.AnalysisServices.AdomdClient as ADOMD

Adjust the paths until you get this script running and then supply the same paths to your first call to one of the python_ssas functions.

Hope this helps 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants