Census of Drug and Alcohol Treatment Services in Northern Ireland: Breakdown by Service Type

In [1]:
from gssutils import *
if is_interactive():
    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches.file_cache import FileCache
    from cachecontrol.heuristics import LastModified
    from pathlib import Path

    session = CacheControl(requests.Session(),
                           cache=FileCache('.cache'),
                           heuristic=LastModified())

    sourceFolder = Path('in')
    sourceFolder.mkdir(exist_ok=True)

    inputURL = 'https://www.health-ni.gov.uk/sites/default/files/publications/dhssps/data-census-drug-alcohol-treatment-services.xlsx'
    inputFile = sourceFolder / 'data-census-drug-alcohol-treatment-services.xlsx'
    response = session.get(inputURL)
    response.raise_for_status()  # stop early rather than saving an error page
    with open(inputFile, 'wb') as f:
        f.write(response.content)
    tab = loadxlstabs(inputFile, sheetids='Table 2')[0]

Loading in\data-census-drug-alcohol-treatment-services.xlsx which has size 46265 bytes
Table names: ['Table 2']


In [2]:
# Keep the non-blank cells in rows 5-11; everything from B12 down and right is excluded.
observations = tab.excel_ref('B5').expand(DOWN).expand(RIGHT).is_not_blank() - tab.excel_ref('B12').expand(DOWN).expand(RIGHT)


In [3]:
observations

{<E9 18.2>, <C9 61.8>, <K6 35.0>, <B8 5.0>, <H5 1010.0>, <G8 172.0>, <B5 1567.0>, <L6 1143.0>, <K9 18.9>, <M6 1178.0>, <D7 437.0>, <C6 925.0>, <K7 148.0>, <M11 0.6>, <L7 537.0>, <K10 80.0>, <H9 64.4>, <C7 571.0>, <C11 0.0>, <J6 141.0>, <I5 540.0>, <E5 528.0>, <L5 1689.0>, <H10 35.6>, <B7 493.0>, <C5 1496.0>, <J9 43.5>, <G11 4.2>, <D9 41.5>, <E8 0.0>, <F5 3567.0>, <D5 1032.0>, <I11 0.0>, <F8 172.0>, <H6 650.0>, <B11 0.3>, <F10 30.0>, <G5 4095.0>, <M5 1874.0>, <G7 1501.0>, <L10 31.8>, <I9 71.7>, <D10 42.3>, <F11 4.8>, <I8 0.0>, <J10 53.1>, <I7 153.0>, <G9 59.1>, <E11 0.0>, <D8 167.0>, <D6 428.0>, <L9 67.7>, <M8 11.0>, <J11 3.4>, <J8 11.0>, <B10 31.5>, <E10 81.8>, <E6 96.0>, <F7 1069.0>, <M9 62.9>, <M7 685.0>, <H8 0.0>, <B6 1069.0>, <I10 28.3>, <K5 185.0>, <J7 172.0>, <C8 0.0>, <H11 0.0>, <C10 38.2>, <H7 360.0>, <M10 36.6>, <F6 2326.0>, <D11 16.2>, <B9 68.2>, <G6 2422.0>, <I6 387.0>, <K11 0.6>, <G10 36.7>, <K8 11.0>, <E7 432.0>, <F9 65.2>, <J5 324.0>}
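
The selection in cell [2] takes everything from B5 down and right, then subtracts everything from B12 down and right, which leaves the block B5:M11. A minimal plain-Python sketch of that set difference, assuming the sheet spans columns B to M and rows 5 to 21 (the `cells` helper is hypothetical and not part of databaker):

```python
from string import ascii_uppercase


def cells(first_row, last_row, first_col="B", last_col="M"):
    """Return the set of Excel-style refs in a rectangular block (hypothetical helper)."""
    start = ascii_uppercase.index(first_col)
    stop = ascii_uppercase.index(last_col) + 1
    return {
        f"{col}{row}"
        for col in ascii_uppercase[start:stop]
        for row in range(first_row, last_row + 1)
    }


everything_from_b5 = cells(5, 21)    # stands in for B5 expanded DOWN and RIGHT
everything_from_b12 = cells(12, 21)  # stands in for B12 expanded DOWN and RIGHT
kept = everything_from_b5 - everything_from_b12

rows_kept = sorted({int(ref[1:]) for ref in kept})
print(rows_kept)   # the rows that survive the subtraction
print(len(kept))   # 12 columns x 7 rows
```

The 82 observations printed above are this 84-cell block minus the two blank cells, L8 and L11, which `is_not_blank()` filters out.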

In [4]:
Service = tab.excel_ref('A5').expand(DOWN).is_not_blank()
Service

{<A13 'Service Type'>, <A11 'Prison (%)'>, <A18 'Prison'>, <A6 'Statutory'>, <A9 'Statutory (%)'>, <A15 'Total'>, <A10 'Non-statutory (%)'>, <A7 'Non-statutory'>, <A8 'Prison'>, <A17 'Non-statutory'>, <A19 'Statutory (%)'>, <A5 'Total'>, <A16 'Statutory'>, <A20 'Non-statutory (%)'>, <A21 'Prison (%)'>}

In [5]:
Treatment = tab.excel_ref('B4').expand(RIGHT).is_not_blank()
Treatment

{<L4 '18 and over'>, <M4 'Total'>, <J4 'Drugs & Alcohol'>, <B4 'Alcohol Only'>, <I4 'Drugs Only'>, <C4 'Drugs Only'>, <F4 '18 and over'>, <K4 'Under 18s'>, <D4 'Drugs & Alcohol'>, <G4 'Total'>, <E4 'Under 18s'>, <H4 'Alcohol Only'>}

In [6]:
sex = tab.excel_ref('B3').expand(RIGHT).is_not_blank()
sex

{<H3 'Female  '>, <B3 'Male'>}
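
The header in H3 is read as 'Female  ' with trailing spaces. A small sketch of normalising such labels once they sit in a pandas column, assuming a frame shaped like the one built later in this notebook (the `demo` frame is illustrative only):

```python
import pandas as pd

# Stand-in for the 'Sex' column produced from the header cells.
demo = pd.DataFrame({"Sex": ["Male", "Female  ", "Female  "]})

# Strip leading and trailing whitespace so 'Female  ' and 'Female' compare equal.
demo["Sex"] = demo["Sex"].str.strip()

labels = sorted(demo["Sex"].unique())
print(labels)
```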

In [7]:
Dimensions = [
            HDim(Treatment,'Treatment Type',DIRECTLY,ABOVE),
            HDim(Service,'Service Type',DIRECTLY,LEFT),
            HDim(sex,'Sex',CLOSEST,LEFT),
            HDimConst('Measure Type', 'Count'),
            HDimConst('Unit','People'),
            HDimConst('Period','1 March 2017'),
            HDimConst('Age','All')
            ]

In [8]:
c1 = ConversionSegment(observations, Dimensions, processTIMEUNIT=True)
# savepreviewhtml(c1)

In [9]:
new_table = c1.topandas()
new_table




Unnamed: 0,OBS,Treatment Type,Service Type,Sex,Measure Type,Unit,Period,Age
0,1567.0,Alcohol Only,Total,Male,Count,People,1 March 2017,All
1,1496.0,Drugs Only,Total,Male,Count,People,1 March 2017,All
2,1032.0,Drugs & Alcohol,Total,Male,Count,People,1 March 2017,All
3,528.0,Under 18s,Total,Male,Count,People,1 March 2017,All
4,3567.0,18 and over,Total,Male,Count,People,1 March 2017,All
5,4095.0,Total,Total,Male,Count,People,1 March 2017,All
6,1010.0,Alcohol Only,Total,Female,Count,People,1 March 2017,All
7,540.0,Drugs Only,Total,Female,Count,People,1 March 2017,All
8,324.0,Drugs & Alcohol,Total,Female,Count,People,1 March 2017,All
9,185.0,Under 18s,Total,Female,Count,People,1 March 2017,All


In [10]:
new_table.columns = ['Value' if x=='OBS' else x for x in new_table.columns]

In [11]:
new_table.dtypes

Value             float64
Treatment Type     object
Service Type       object
Sex                object
Measure Type       object
Unit               object
Period             object
Age                object
dtype: object

In [12]:
new_table.tail(5)

Unnamed: 0,Value,Treatment Type,Service Type,Sex,Measure Type,Unit,Period,Age
77,0.0,Alcohol Only,Prison (%),Female,Count,People,1 March 2017,All
78,0.0,Drugs Only,Prison (%),Female,Count,People,1 March 2017,All
79,3.4,Drugs & Alcohol,Prison (%),Female,Count,People,1 March 2017,All
80,0.6,Under 18s,Prison (%),Female,Count,People,1 March 2017,All
81,0.6,Total,Prison (%),Female,Count,People,1 March 2017,All
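
The rows above hold percentages (their Service Type ends in '(%)'), yet the constant dimensions label them as Count / People. One possible way to separate them, sketched on a small stand-in frame; the 'Percentage' and 'Percent' labels are assumptions, not values confirmed by the source:

```python
import pandas as pd

# Stand-in rows; the real frame is new_table.
demo = pd.DataFrame({
    "Service Type": ["Prison", "Prison (%)", "Statutory (%)"],
    "Measure Type": ["Count", "Count", "Count"],
    "Unit": ["People", "People", "People"],
    "Value": [11.0, 0.6, 62.9],
})

is_pct = demo["Service Type"].str.endswith("(%)")
demo.loc[is_pct, "Measure Type"] = "Percentage"  # assumed label
demo.loc[is_pct, "Unit"] = "Percent"             # assumed label

# Drop the suffix so a percentage row shares its service type with the count row.
demo["Service Type"] = demo["Service Type"].str.replace(" (%)", "", regex=False)
print(demo)
```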


In [13]:
new_table.count()

Value             82
Treatment Type    82
Service Type      82
Sex               82
Measure Type      82
Unit              82
Period            82
Age               82
dtype: int64

In [14]:
new_table = new_table[new_table['Value'] != 0]

In [15]:
new_table.count()

Value             74
Treatment Type    74
Service Type      74
Sex               74
Measure Type      74
Unit              74
Period            74
Age               74
dtype: int64
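
The drop from 82 to 74 rows corresponds to the eight cells whose value is exactly 0. A sketch of checking that number before filtering, on a made-up stand-in Series (a zero may still be a genuine observation, so whether to drop it is a judgment call):

```python
import pandas as pd

# Stand-in for new_table['Value'].
values = pd.Series([1567.0, 0.0, 5.0, 0.0, 0.3])

n_zero = int((values == 0).sum())   # how many rows the filter will remove
kept = values[values != 0].copy()   # same filter as in the cell above

# The filtered length should equal the original length minus the zero count.
assert len(values) - n_zero == len(kept)
print(n_zero, len(kept))
```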

In [16]:
new_table = new_table.copy()  # work on an independent copy to avoid SettingWithCopyWarning
new_table['Treatment Type'] = new_table['Treatment Type'].fillna('All')
# new_table['Service Type'] = 'All'
new_table['Residential Status'] = 'All'
new_table['Health and Social Care Trust'] = 'All'


In [17]:
new_table = new_table[['Period', 'Sex', 'Age', 'Service Type', 'Residential Status', 'Treatment Type', 'Health and Social Care Trust', 'Measure Type', 'Unit', 'Value']]

In [18]:
new_table

Unnamed: 0,Period,Sex,Age,Service Type,Residential Status,Treatment Type,Health and Social Care Trust,Measure Type,Unit,Value
0,1 March 2017,Male,All,Total,All,Alcohol Only,All,Count,People,1567.0
1,1 March 2017,Male,All,Total,All,Drugs Only,All,Count,People,1496.0
2,1 March 2017,Male,All,Total,All,Drugs & Alcohol,All,Count,People,1032.0
3,1 March 2017,Male,All,Total,All,Under 18s,All,Count,People,528.0
4,1 March 2017,Male,All,Total,All,18 and over,All,Count,People,3567.0
5,1 March 2017,Male,All,Total,All,Total,All,Count,People,4095.0
6,1 March 2017,Female,All,Total,All,Alcohol Only,All,Count,People,1010.0
7,1 March 2017,Female,All,Total,All,Drugs Only,All,Count,People,540.0
8,1 March 2017,Female,All,Total,All,Drugs & Alcohol,All,Count,People,324.0
9,1 March 2017,Female,All,Total,All,Under 18s,All,Count,People,185.0


In [19]:
new_table.tail()

Unnamed: 0,Period,Sex,Age,Service Type,Residential Status,Treatment Type,Health and Social Care Trust,Measure Type,Unit,Value
75,1 March 2017,Male,All,Prison (%),All,18 and over,All,Count,People,4.8
76,1 March 2017,Male,All,Prison (%),All,Total,All,Count,People,4.2
79,1 March 2017,Female,All,Prison (%),All,Drugs & Alcohol,All,Count,People,3.4
80,1 March 2017,Female,All,Prison (%),All,Under 18s,All,Count,People,0.6
81,1 March 2017,Female,All,Prison (%),All,Total,All,Count,People,0.6
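
The Period column carries the free-text label '1 March 2017'. If a machine-readable date is wanted downstream, a minimal sketch of converting it to ISO 8601 with the standard library (this step is an assumption and not part of the original pipeline):

```python
from datetime import datetime

period_label = "1 March 2017"

# %B matches the full month name in the current locale (English assumed here).
period_iso = datetime.strptime(period_label, "%d %B %Y").date().isoformat()
print(period_iso)
```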
