<h1 style="font-size:42px; text-align:center; margin-bottom:30px;"><span style="color:SteelBlue">TM1py:</span> Reading Data</h1>
<hr>
This notebook walks through the different ways to get data into your Python scripts: from CSV files, from TM1, and from web APIs.

## Part 1: Reading data from a CSV file
Introduction to [pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

In [1]:
#import pandas to get data from csv file
import pandas as pd

In [2]:
# pd.read_csv will store the information into a pandas dataframe called df
df = pd.read_csv('reading_data.csv')

In [4]:
# A pandas dataframe comes with many built-in methods, such as head(),
# which displays the first five rows
df.head()

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,1/12/2010 8:26,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,1/12/2010 8:26,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,1/12/2010 8:26,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,1/12/2010 8:26,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,1/12/2010 8:26,3.39,17850.0,United Kingdom


In [6]:
#write data to csv
df.to_csv('my_new_filePyPal.csv')
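
By default, `to_csv` also writes the dataframe index as an unnamed first column, and `read_csv` then loads that column as `Unnamed: 0`. The sketch below round-trips a tiny in-memory CSV to show the effect and the `index=False` option. The two sample rows are illustrative, and `io.StringIO` stands in for a file on disk.

```python
import io
import pandas as pd

# Two illustrative rows standing in for a CSV file on disk
raw = "InvoiceNo,Country\n536365,United Kingdom\n536370,France\n"
df = pd.read_csv(io.StringIO(raw))

# Default: the index (0, 1) is written as an extra, unnamed first column
with_index = pd.read_csv(io.StringIO(df.to_csv()))

# index=False: only the data columns are written
without_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

Passing `index=False` is usually the simpler choice when the index carries no information of its own.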

In [7]:
# Find all unique values for one column
df.Country.unique()

array(['United Kingdom', 'France', 'Australia', 'Netherlands', 'Germany',
       'Norway', 'EIRE', 'Switzerland', 'Spain', 'Poland', 'Portugal',
       'Italy', 'Belgium', 'Lithuania'], dtype=object)

<h2> Part 2: Reading data from TM1 </h2>
<span> Going through all TM1py options to load data from TM1 into our Jupyter notebook</span>

<h3>Setting up connection to TM1</h3>

In [8]:
#import TM1py services
from TM1py.Services import TM1Service
from TM1py.Utils import Utils

In [9]:
#TM1 credentials
ADDRESS = "localhost"
PORT = 8009
USER = "admin"
PWD = "apple"
SSL = True

In [10]:
#Connect to the TM1 instance
tm1 = TM1Service(address=ADDRESS, port=PORT, user=USER, password=PWD, ssl=SSL)

In [11]:
# Cube view used in this notebook
cube_name = 'Bike Shares'
view_name = '2014 to 2017 Counts by Day'

## Getting the view as map of coordinates and values
- **Use Case:** Get the values with all intersections

In [15]:
# query first 5 cells from the cube view as coordinate-cell dictionary
cells = tm1.cubes.cells.execute_view(cube_name=cube_name, view_name=view_name, private=False, top=5)
cells

{('[Version].[Version].[Actual]', '[Date].[Date].[2014-01-01]', '[City].[City].[NYC]', '[Bike Shares Measure].[Bike Shares Measure].[Count]'): {'Value': 6059}, ('[Version].[Version].[Actual]', '[Date].[Date].[2014-01-01]', '[City].[City].[Chicago]', '[Bike Shares Measure].[Bike Shares Measure].[Count]'): {'Value': 123}, ('[Version].[Version].[Actual]', '[Date].[Date].[2014-01-01]', '[City].[City].[Washington]', '[Bike Shares Measure].[Bike Shares Measure].[Count]'): {'Value': 3011}, ('[Version].[Version].[Actual]', '[Date].[Date].[2014-01-02]', '[City].[City].[NYC]', '[Bike Shares Measure].[Bike Shares Measure].[Count]'): {'Value': 8600}, ('[Version].[Version].[Actual]', '[Date].[Date].[2014-01-02]', '[City].[City].[Chicago]', '[Bike Shares Measure].[Bike Shares Measure].[Count]'): {'Value': 112}}

In [16]:
# print the entries of the coordinate-cell dictionary using plain element names,
# e.g. 'Actual' instead of '[Version].[Version].[Actual]'
for element_unique_names, cell in cells.items():
    # extract element names from unique-element-names
    element_names = Utils.element_names_from_element_unique_names(
        element_unique_names=element_unique_names)
    # take value from cell
    value = cell["Value"]
    print(element_names, value)

('Actual', '2014-01-01', 'NYC', 'Count') 6059
('Actual', '2014-01-01', 'Chicago', 'Count') 123
('Actual', '2014-01-01', 'Washington', 'Count') 3011
('Actual', '2014-01-02', 'NYC', 'Count') 8600
('Actual', '2014-01-02', 'Chicago', 'Count') 112


## Getting the number of cells
- **Use Case:** Check how many cells you are going to work with
- **Note**: Very fast

In [17]:
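# count the cells in the view without retrieving them
# note: %time on a line of its own times an empty statement, hence the 0 ns reported below;
# write the call on the same line as %time, or start the cell with %%time, to time it
%time
df_cellcount = tm1.cubes.cells.execute_view_cellcount(cube_name=cube_name, view_name=view_name, private=False)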

Wall time: 0 ns


In [18]:
df_cellcount

4383
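
The count can be sanity-checked with a little date arithmetic. The sketch below assumes the view holds exactly one cell per day and city, from 2014-01-01 through 2017-12-31, for the three cities NYC, Chicago and Washington.

```python
from datetime import date

# Number of days in the range, both end points included
days = (date(2017, 12, 31) - date(2014, 1, 1)).days + 1

cities = 3  # NYC, Chicago, Washington
expected_cells = days * cities

print(days, expected_cells)
```

Under those assumptions the product matches the cell count returned by TM1.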

## Getting data as CSV format
- **Use Case**: Get your data as CSV format
- **Note**: Very fast

In [19]:
# use execute_view_csv for a cube view, or execute_mdx_csv for an MDX query
%time
csv = tm1.cubes.cells.execute_view_csv(cube_name=cube_name, view_name=view_name, private=False)

Wall time: 0 ns


In [20]:
# display the first 200 characters of the result as a raw CSV string
csv[0:200]

'Date,City,Value\r\n2014-01-01,NYC,6059\r\n2014-01-01,Chicago,123\r\n2014-01-01,Washington,3011\r\n2014-01-02,NYC,8600\r\n2014-01-02,Chicago,112\r\n2014-01-02,Washington,3316\r\n2014-01-03,NYC,1144\r\n2014-01-03,Chica'

In [21]:
# display first 10 lines of the result
for line in csv.split("\r\n")[0:10]:
    print(line)

Date,City,Value
2014-01-01,NYC,6059
2014-01-01,Chicago,123
2014-01-01,Washington,3011
2014-01-02,NYC,8600
2014-01-02,Chicago,112
2014-01-02,Washington,3316
2014-01-03,NYC,1144
2014-01-03,Chicago,6
2014-01-03,Washington,1608


In [22]:
# display last 10 lines of the result
# (the final element is empty because the string ends with "\r\n")
for line in csv.split("\r\n")[-10:]:
    print(line)

2017-12-29,NYC,13759
2017-12-29,Chicago,1076
2017-12-29,Washington,3088
2017-12-30,NYC,5956
2017-12-30,Chicago,548
2017-12-30,Washington,1876
2017-12-31,NYC,6569
2017-12-31,Chicago,651
2017-12-31,Washington,1437
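
Splitting on `"\r\n"` works for a quick look, but the standard-library `csv` module handles line endings and quoting for you. The sketch below parses a short sample in the same shape as the string returned above. It names the string `csv_text` because a variable called `csv`, as used in this notebook, would shadow the module.

```python
import csv
import io

# Short sample in the same shape as the string returned by TM1
csv_text = "Date,City,Value\r\n2014-01-01,NYC,6059\r\n2014-01-01,Chicago,123\r\n"

# DictReader uses the header line as keys for every following row
rows = list(csv.DictReader(io.StringIO(csv_text, newline="")))
total = sum(int(row["Value"]) for row in rows)

print(rows[0]["City"], total)
```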



## Getting data as (pandas) dataframe
- **Use Case**: Get your data as a pandas dataframe
- **Note**: Useful for further data analysis in Python

In [27]:
%time
df = tm1.cubes.cells.execute_view_dataframe(cube_name=cube_name, view_name=view_name, private=False)

Wall time: 0 ns


In [26]:
df.head()

         Date        City  Value
0  2014-01-01         NYC   6059
1  2014-01-01     Chicago    123
2  2014-01-01  Washington   3011
3  2014-01-02         NYC   8600
4  2014-01-02     Chicago    112


In [25]:
df.to_csv(view_name+"Pypal.csv")

## Getting data as (pandas) pivot dataframe
- **Use Case**: Get your data as a pandas dataframe that follows the row and column layout of your view
- **Note**: Useful for further data analysis in Python

In [29]:
%time
df_pivot = tm1.cubes.cells.execute_view_dataframe_pivot(cube_name=cube_name, view_name=view_name, private=False)

Wall time: 0 ns


In [31]:
# print first 5 records
df_pivot.head()

             Values
City        Chicago     NYC Washington
Date
2014-01-01    123.0  6059.0     3011.0
2014-01-02    112.0  8600.0     3316.0
2014-01-03      6.0  1144.0     1608.0
2014-01-04    205.0  2292.0     2242.0
2014-01-05     33.0  2678.0     2060.0


In [32]:
# print last 5 records
df_pivot.tail()

             Values
City        Chicago      NYC Washington
Date
2017-12-27   1138.0  16365.0     3321.0
2017-12-28   1294.0  13420.0     2866.0
2017-12-29   1076.0  13759.0     3088.0
2017-12-30    548.0   5956.0     1876.0
2017-12-31    651.0   6569.0     1437.0
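
The flat and the pivoted layouts hold the same cells. As a rough illustration, and not a description of how TM1py builds its result, a flat dataframe can be reshaped with pandas' own `pivot`. The six values below are the first cells of the view shown earlier.

```python
import pandas as pd

# Flat layout: one row per (Date, City) cell
flat = pd.DataFrame({
    "Date": ["2014-01-01"] * 3 + ["2014-01-02"] * 3,
    "City": ["NYC", "Chicago", "Washington"] * 2,
    "Value": [6059, 123, 3011, 8600, 112, 3316],
})

# Pivoted layout: dates on rows, cities on columns
pivot = flat.pivot(index="Date", columns="City", values="Value")
print(pivot)
```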


## Getting data in custom JSON format
- **Use Case**: Query additional information, such as:
    - Cell is RuleDerived
    - Cell is Consolidated
    - Member properties
    - Attribute Values

- **Note**: very flexible. Not fast.

In [33]:
%time
raw_json = tm1.cubes.cells.execute_view_raw(
    cube_name=cube_name, 
    view_name=view_name, 
    private=False, 
    elem_properties=["Type"],
    cell_properties=["RuleDerived", "Value"])

Wall time: 0 ns


In [37]:
# print full response
raw_json

{'@odata.context': '$metadata#Cellsets(Cube(Name,Dimensions+(Name)),Axes(Tuples+(Members+(Name,Element+(Type)))),Cells(Value,RuleDerived))/$entity',
 'ID': 'oGuKUDcCAICTAAAg',
 'Cube': {'Name': 'Bike Shares',
  'Dimensions': [{'Name': 'Version'},
   {'Name': 'Date'},
   {'Name': 'City'},
   {'Name': 'Bike Shares Measure'}]},
 'Axes': [{'Ordinal': 0,
   'Cardinality': 3,
   'Tuples': [{'Ordinal': 0,
     'Members': [{'Name': 'NYC', 'Element': {'Type': 'Numeric'}}]},
    {'Ordinal': 1,
     'Members': [{'Name': 'Chicago', 'Element': {'Type': 'Numeric'}}]},
    {'Ordinal': 2,
     'Members': [{'Name': 'Washington', 'Element': {'Type': 'Numeric'}}]}]},
  {'Ordinal': 1,
   'Cardinality': 1461,
   'Tuples': [{'Ordinal': 0,
     'Members': [{'Name': '2014-01-01', 'Element': {'Type': 'Numeric'}}]},
    {'Ordinal': 1,
     'Members': [{'Name': '2014-01-02', 'Element': {'Type': 'Numeric'}}]},
    {'Ordinal': 2,
     'Members': [{'Name': '2014-01-03', 'Element': {'Type': 'Numeric'}}]},
    {'Ordi

In [38]:
# Extract the cube metadata (name and dimensions) from the response
raw_json['Cube']

{'Name': 'Bike Shares',
 'Dimensions': [{'Name': 'Version'},
  {'Name': 'Date'},
  {'Name': 'City'},
  {'Name': 'Bike Shares Measure'}]}
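
The axes can be walked in the same way to label each cell. The sketch below uses a trimmed, illustrative response: two dates instead of 1461, and placeholder `RuleDerived` flags that are not taken from the notebook output. It also assumes the response carries a `Cells` list ordered row by row, so that the column axis (Ordinal 0) varies fastest.

```python
# Trimmed, illustrative response in the shape shown above
raw_json = {
    "Axes": [
        {"Ordinal": 0, "Cardinality": 3, "Tuples": [
            {"Ordinal": 0, "Members": [{"Name": "NYC"}]},
            {"Ordinal": 1, "Members": [{"Name": "Chicago"}]},
            {"Ordinal": 2, "Members": [{"Name": "Washington"}]}]},
        {"Ordinal": 1, "Cardinality": 2, "Tuples": [
            {"Ordinal": 0, "Members": [{"Name": "2014-01-01"}]},
            {"Ordinal": 1, "Members": [{"Name": "2014-01-02"}]}]},
    ],
    # Placeholder RuleDerived flags; values are the first cells of the view
    "Cells": [{"Value": v, "RuleDerived": False}
              for v in (6059, 123, 3011, 8600, 112, 3316)],
}

columns = [t["Members"][0]["Name"] for t in raw_json["Axes"][0]["Tuples"]]
rows = [t["Members"][0]["Name"] for t in raw_json["Axes"][1]["Tuples"]]

# Assumption: cells are listed row by row, so the column index varies fastest
table = {}
for position, cell in enumerate(raw_json["Cells"]):
    row, column = divmod(position, len(columns))
    table[(rows[row], columns[column])] = cell["Value"]

print(table[("2014-01-02", "Washington")])
```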

## Getting cell values
- **Use Case**: sometimes you are only interested in the cell values. Skipping the elements in the response increases performance
- **Note**: Fast and light

In [40]:
%time
values = tm1.cubes.cells.execute_view_values(cube_name=cube_name, view_name=view_name, private=False)

Wall time: 0 ns


In [42]:
# extract first ten values
first_ten = list(values)[0:10]
# print first ten values
print(first_ten)

[]
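
If `values` is a generator, which is an assumption about the TM1py version in use, it can be read only once and a second pass yields nothing. That would be one explanation for the empty list printed above. The sketch below shows the behaviour with a hypothetical stand-in generator, and uses `itertools.islice` to take the first items without reading the rest.

```python
from itertools import islice


def cell_values():
    # Hypothetical stand-in for the values returned by the query
    yield from (6059, 123, 3011, 8600, 112, 3316)


values = cell_values()
first_three = list(islice(values, 3))  # reads only three items
remaining = list(values)               # continues where islice stopped
again = list(values)                   # the generator is now exhausted

print(first_three, remaining, again)
```

Converting the result to a list once, right after the query, avoids the problem if the values are needed more than once.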


## Getting row elements and cell values only
- **Use Case**: sometimes elements in columns and titles are irrelevant. Skipping these elements in the response increases performance
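- **Note**: Faster than querying everything. The run recorded below failed with an `AttributeError` because the installed TM1py version does not provide `execute_view_rows_and_values`, so check your TM1py version before relying on this call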

In [44]:
rows_and_values = tm1.cubes.cells.execute_view_rows_and_values(
    cube_name=cube_name, 
    view_name=view_name, 
    private=False,
    element_unique_names=False)

AttributeError: 'CellService' object has no attribute 'execute_view_rows_and_values'

In [45]:
for row_elements, values_by_row in rows_and_values.items():
    print(row_elements, values_by_row)

NameError: name 'rows_and_values' is not defined
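
When the method is not available, a similar structure can be derived from the coordinate-cell dictionary shown earlier. This is only a sketch: `rows_and_values_from_cells` is a hypothetical helper, it assumes Date (position 1 in each coordinate tuple) is the row dimension, and it does not give the performance benefit of skipping elements on the server.

```python
from collections import defaultdict


def rows_and_values_from_cells(cells, row_positions):
    """Group cell values by the elements at the given tuple positions."""
    grouped = defaultdict(list)
    for element_names, cell in cells.items():
        row_key = tuple(element_names[i] for i in row_positions)
        grouped[row_key].append(cell["Value"])
    return dict(grouped)


# First five cells of the view, with plain element names
cells = {
    ("Actual", "2014-01-01", "NYC", "Count"): {"Value": 6059},
    ("Actual", "2014-01-01", "Chicago", "Count"): {"Value": 123},
    ("Actual", "2014-01-01", "Washington", "Count"): {"Value": 3011},
    ("Actual", "2014-01-02", "NYC", "Count"): {"Value": 8600},
    ("Actual", "2014-01-02", "Chicago", "Count"): {"Value": 112},
}

rows_and_values = rows_and_values_from_cells(cells, row_positions=[1])
for row_elements, values_by_row in rows_and_values.items():
    print(row_elements, values_by_row)
```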

## Getting data with attributes values
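To get attribute values alongside cell values, retrieve the data with an MDX query. The query below defines a calculated member that reads the `City Alias` attribute from the `}ElementAttributes_City` cube.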

In [46]:
mdx = """
    WITH MEMBER [Bike Shares Measure].[City Alias] AS [}ElementAttributes_City].([}ElementAttributes_City].[City Alias])
    SELECT 
    NON EMPTY {[Date].Members}*{TM1SubsetAll([City])} ON ROWS, 
    NON EMPTY {[Bike Shares Measure].[Count], [Bike Shares Measure].[City Alias] } ON COLUMNS 
    FROM [Bike Shares] 
    WHERE ([Version].[Actual],[Bike Shares Measure].[Count])"""

# execute the MDX query; the result is a coordinate-cell dictionary
data = tm1.cubes.cells.execute_mdx(mdx)

In [48]:
#Build pandas dataframe
df = Utils.build_pandas_dataframe_from_cellset(data, multiindex=False)
print(df)

      Version        Date        City Bike Shares Measure         Values
1      Actual        2014     Chicago          City Alias               
0      Actual        2014     Chicago               Count        2454634
3      Actual        2014         NYC          City Alias  New York City
2      Actual        2014         NYC               Count        8081216
7      Actual        2014  Total City          City Alias               
6      Actual        2014  Total City               Count       13449000
5      Actual        2014  Washington          City Alias               
4      Actual        2014  Washington               Count        2913150
9      Actual     2014-01     Chicago          City Alias               
8      Actual     2014-01     Chicago               Count          25076
11     Actual     2014-01         NYC          City Alias  New York City
10     Actual     2014-01         NYC               Count         300400
15     Actual     2014-01  Total City          City

In [49]:
# get pivot dataframe
pivot = tm1.cubes.cells.execute_mdx_dataframe_pivot(mdx)
print(pivot)

                              Values          
Bike Shares Measure       City Alias     Count
Date       City                               
2014       Chicago                     2454634
           NYC         New York City   8081216
           Total City                 13449000
           Washington                  2913150
2014-01    Chicago                       25076
           NYC         New York City    300400
           Total City                   438173
           Washington                   112697
2014-01-01 Chicago                         123
           NYC         New York City      6059
           Total City                     9193
           Washington                     3011
2014-01-02 Chicago                         112
           NYC         New York City      8600
           Total City                    12028
           Washington                     3316
2014-01-03 Chicago                           6
           NYC         New York City      1144
           To

## Part 3: Reading from APIs
### Getting data from a web service: introduction to JSON
The code below has been extracted from this article: [Upload weather data from web service](https://code.cubewise.com/tm1py-help-content/upload-weather-data-from-web-services-into-planning-analytics)

In [50]:
#library for HTTP / REST Request against Webservices
import requests
#standard library for JSON parsing, manipulation
import json

In [52]:
# Define constants
STATION = 'GHCND:USW00014732'
FROM, TO = '2017-01-01', '2017-01-04'
HEADERS = {"token": 'yyqEBOAbHVbtXkfAmZuPNfnSXvdfyhgn'}

In [53]:
url = 'https://www.ncdc.noaa.gov/cdo-web/api/v2/data?' \
      'datasetid=GHCND&' \
      'startdate=' + FROM + '&' \
      'enddate=' + TO + '&' \
      'limit=1000&' \
      'datatypeid=TMIN&' \
      'datatypeid=TAVG&' \
      'datatypeid=TMAX&' \
      'stationid=' + STATION

print(url)

https://www.ncdc.noaa.gov/cdo-web/api/v2/data?datasetid=GHCND&startdate=2017-01-01&enddate=2017-01-04&limit=1000&datatypeid=TMIN&datatypeid=TAVG&datatypeid=TMAX&stationid=GHCND:USW00014732
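
String concatenation works here, but the standard library can assemble and escape the query string for you. The sketch below rebuilds the same URL with `urllib.parse.urlencode`. A list of pairs is used because `datatypeid` appears three times, and `safe=":"` is passed so that the colon in the station id stays unescaped, as in the URL printed above.

```python
from urllib.parse import urlencode

STATION = "GHCND:USW00014732"
FROM, TO = "2017-01-01", "2017-01-04"

# A list of pairs allows the same key (datatypeid) to appear several times
params = [
    ("datasetid", "GHCND"),
    ("startdate", FROM),
    ("enddate", TO),
    ("limit", 1000),
    ("datatypeid", "TMIN"),
    ("datatypeid", "TAVG"),
    ("datatypeid", "TMAX"),
    ("stationid", STATION),
]

base = "https://www.ncdc.noaa.gov/cdo-web/api/v2/data?"
url = base + urlencode(params, safe=":")
print(url)
```

`requests.get` also accepts such a structure through its `params` argument, in which case `requests` builds the query string itself.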


In [54]:
# Send the request to the NOAA API and parse the JSON response
# Pretty-print the first three items of the result set

response = requests.get(url, headers=HEADERS).json()
results = response["results"]   

print(json.dumps(results[0:3], indent=2))

[
  {
    "date": "2017-01-01T00:00:00",
    "datatype": "TAVG",
    "station": "GHCND:USW00014732",
    "attributes": "H,,S,",
    "value": 80
  },
  {
    "date": "2017-01-01T00:00:00",
    "datatype": "TMAX",
    "station": "GHCND:USW00014732",
    "attributes": ",,W,2400",
    "value": 94
  },
  {
    "date": "2017-01-01T00:00:00",
    "datatype": "TMIN",
    "station": "GHCND:USW00014732",
    "attributes": ",,W,2400",
    "value": 39
  }
]


In [55]:
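# Rearrange the data into a coordinate-value dictionary
cells = dict()

for record in results:
    # GHCND temperatures arrive in tenths of degrees Celsius, so divide by 10
    value = record['value'] / 10
    # coordinates: version, date (YYYY-MM-DD), city, measure
    coordinates = ("Actual", record['date'][0:10], "NYC", record['datatype'])
    cells[coordinates] = value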

In [57]:
for coordinate, value in cells.items():
    print(coordinate, value)

('Actual', '2017-01-01', 'NYC', 'TAVG') 8.0
('Actual', '2017-01-01', 'NYC', 'TMAX') 9.4
('Actual', '2017-01-01', 'NYC', 'TMIN') 3.9
('Actual', '2017-01-02', 'NYC', 'TAVG') 4.4
('Actual', '2017-01-02', 'NYC', 'TMAX') 5.6
('Actual', '2017-01-02', 'NYC', 'TMIN') 3.3
('Actual', '2017-01-03', 'NYC', 'TAVG') 5.6
('Actual', '2017-01-03', 'NYC', 'TMAX') 8.3
('Actual', '2017-01-03', 'NYC', 'TMIN') 4.4
('Actual', '2017-01-04', 'NYC', 'TAVG') 8.2
('Actual', '2017-01-04', 'NYC', 'TMAX') 12.2
('Actual', '2017-01-04', 'NYC', 'TMIN') 2.2
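
The same rearrangement can be wrapped in a small function so that it is easy to reuse and to test without calling the API. This is a sketch: `records_to_cells` is a hypothetical name, and the idea that a response may arrive without a `results` key (for example when nothing matches the request) is an assumption that the function simply tolerates.

```python
def records_to_cells(payload, version="Actual", city="NYC"):
    """Turn an API payload into a coordinate-value dictionary."""
    cells = {}
    # .get() tolerates a payload without a "results" key
    for record in payload.get("results", []):
        coordinates = (version, record["date"][0:10], city, record["datatype"])
        # divide by 10, as the original cell does
        cells[coordinates] = record["value"] / 10
    return cells


# The three records printed earlier, reduced to the fields that are used
payload = {"results": [
    {"date": "2017-01-01T00:00:00", "datatype": "TAVG", "value": 80},
    {"date": "2017-01-01T00:00:00", "datatype": "TMAX", "value": 94},
    {"date": "2017-01-01T00:00:00", "datatype": "TMIN", "value": 39},
]}

print(records_to_cells(payload))
print(records_to_cells({}))
```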


In [58]:
# Write values back to TM1
tm1.cubes.cells.write_values("Weather Data", cells)

<Response [204]>