Table 2.6.1: Accommodation status of all young people in treatment 2016-17

In [1]:
from gssutils import *

if is_interactive():
    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches.file_cache import FileCache
    from cachecontrol.heuristics import LastModified
    from pathlib import Path

    session = CacheControl(requests.Session(),
                           cache=FileCache('.cache'),
                           heuristic=LastModified())

    sourceFolder = Path('in')
    sourceFolder.mkdir(exist_ok=True)

    inputURL = 'https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/664944/'\
                    'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    inputFile = sourceFolder / 'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    response = session.get(inputURL)
    with open(inputFile, 'wb') as f:
        f.write(response.content)
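
The download cell spells out the long spreadsheet file name twice, once in the URL and once in the local path. As a possible alternative, the local name could be derived from the URL with the standard library. This is a minimal sketch; `filename_from_url`, `input_url` and `input_name` are hypothetical names that do not appear in the notebook.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, ignoring query and fragment."""
    return PurePosixPath(urlparse(url).path).name


# Same URL as in the cell above, joined from two string literals.
input_url = (
    'https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/664944/'
    'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
)
input_name = filename_from_url(input_url)
```

With this helper, `sourceFolder / input_name` would give the same path as the hand-written `inputFile`.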

In [2]:
tab = loadxlstabs(inputFile, sheetids='2.6.1 Accommodation')[0]

Loading in\Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls which has size 281600 bytes
Table names: ['2.6.1 Accommodation']


In [3]:
observations = tab.excel_ref('B4').expand(DOWN).expand(RIGHT).is_not_blank()
observations

{<B9 59.0>, <B12 267.0>, <C8 0.01>, <B5 1292.0>, <C6 0.04>, <C9 0.0>, <B8 153.0>, <C5 0.08>, <C10 0.0>, <B6 654.0>, <C4 0.83>, <C7 0.03>, <B7 467.0>, <C11 1.0>, <B13 16436.0>, <B11 16169.0>, <B4 13500.0>, <B10 44.0>}

In [4]:
referral = tab.excel_ref('A4').expand(DOWN).is_not_blank() - tab.excel_ref('A32')
referral

{<A12 'Missing or inconsistent data'>, <A11 'Total'>, <A9 'Independent – no fixed abode'>, <A5 'YP living in care'>, <A6 'YP supported housing'>, <A13 'Total'>, <A8 'Independent – unsettled accommodation / housing problem'>, <A7 'Independent – settled accommodation'>, <A10 'YP living in secure care'>, <A4 'Living with parents or other relatives'>}
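
The two outputs above can be cross-checked before any reshaping. The sketch below copies the column B counts and column C proportions from the output of cell 3 and assumes, based on C11 being 1.0, that the proportions are taken against the first 'Total' row (B11), with the second 'Total' (B13) adding the 'Missing or inconsistent data' row. The variable names are illustrative only.

```python
# Values copied from the output of cell 3 (rows 4 to 10 of the sheet).
counts = [13500, 1292, 654, 467, 153, 59, 44]
percentages = [0.83, 0.08, 0.04, 0.03, 0.01, 0.0, 0.0]
total_known = 16169   # B11, first 'Total'
missing = 267         # B12, 'Missing or inconsistent data'
total_all = 16436     # B13, second 'Total'

# The categories should add up to the first total, and the second total
# should be the first total plus the missing/inconsistent row.
assert sum(counts) == total_known
assert total_known + missing == total_all

# Recompute each proportion to two decimal places and collect any that
# disagree with the value shown in column C.
recomputed = [round(n / total_known, 2) for n in counts]
mismatches = [
    (n, shown, calc)
    for n, shown, calc in zip(counts, percentages, recomputed)
    if shown != calc
]
```

An empty `mismatches` list indicates that the published proportions agree with the counts at two decimal places.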

In [5]:
measuretype = tab.excel_ref('B3').expand(RIGHT).is_not_blank() 
measuretype

{<B3 'n'>, <C3 '%'>}

In [6]:
Dimensions = [
            HDimConst('Substance','All'),
            HDim(referral,'Basis of treatment',DIRECTLY,LEFT),
            HDim(measuretype,'Measure Type',DIRECTLY,ABOVE),
            HDimConst('Unit','People')            
            ]
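
The `DIRECTLY` lookups in the dimension list pair each observation with the header cell in the same row (`LEFT`) or the same column (`ABOVE`). The following plain-Python sketch illustrates that idea on a few cells copied from the outputs above. It is not databaker's implementation, and `split_ref`, `lookup_directly` and the sample dictionaries are hypothetical names used only for illustration.

```python
import re

# A few cells copied from the outputs above, keyed by Excel reference.
row_labels = {'A4': 'Living with parents or other relatives', 'A5': 'YP living in care'}
column_labels = {'B3': 'n', 'C3': '%'}
sample_observations = {'B4': 13500.0, 'C4': 0.83, 'B5': 1292.0, 'C5': 0.08}


def split_ref(ref):
    """Split an Excel reference such as 'B4' into ('B', 4)."""
    column, row = re.fullmatch(r'([A-Z]+)(\d+)', ref).groups()
    return column, int(row)


def lookup_directly(ref):
    """Return the label in the same row and the label in the same column as ref."""
    column, row = split_ref(ref)
    left = next(v for k, v in row_labels.items() if split_ref(k)[1] == row)
    above = next(v for k, v in column_labels.items() if split_ref(k)[0] == column)
    return left, above


# Order by sheet row, then by reference, so each row's 'n' precedes its '%'.
records = [
    (value, *lookup_directly(ref))
    for ref, value in sorted(
        sample_observations.items(),
        key=lambda item: (split_ref(item[0])[1], item[0]),
    )
]
```

Each tuple in `records` holds the observation value, its row label and its measure type, which mirrors the `OBS`, `Basis of treatment` and `Measure Type` columns that the conversion produces.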

In [7]:
c1 = ConversionSegment(observations, Dimensions, processTIMEUNIT=True)
# if is_interactive():
#     savepreviewhtml(c1)

In [8]:
new_table = c1.topandas()
new_table




Unnamed: 0,OBS,Substance,Basis of treatment,Measure Type,Unit
0,13500.0,All,Living with parents or other relatives,n,People
1,0.83,All,Living with parents or other relatives,%,People
2,1292.0,All,YP living in care,n,People
3,0.08,All,YP living in care,%,People
4,654.0,All,YP supported housing,n,People
5,0.04,All,YP supported housing,%,People
6,467.0,All,Independent – settled accommodation,n,People
7,0.03,All,Independent – settled accommodation,%,People
8,153.0,All,Independent – unsettled accommodation / housin...,n,People
9,0.01,All,Independent – unsettled accommodation / housin...,%,People


In [9]:
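# Drop zero-valued observations. The .copy() gives an independent frame,
# so later column assignments do not act on a slice of the original.
new_table = new_table[new_table['OBS'] != 0].copy()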

In [10]:
new_table = new_table.rename(columns={'OBS': 'Value'})

In [11]:
new_table['Basis of treatment'] = new_table['Basis of treatment'].map(
    lambda x: {
        'Total' : 'All' 
        }.get(x, x))


In [12]:
# Assign the result back instead of using inplace=True on a column selection.
new_table['Basis of treatment'] = new_table['Basis of treatment'].fillna('All including missing and inconsistent data')


In [13]:
new_table['Basis of treatment'] = 'Accommodation status/' + new_table['Basis of treatment']


In [14]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit
0,13500.0,All,Accommodation status/Living with parents or ot...,n,People
1,0.83,All,Accommodation status/Living with parents or ot...,%,People
2,1292.0,All,Accommodation status/YP living in care,n,People
3,0.08,All,Accommodation status/YP living in care,%,People
4,654.0,All,Accommodation status/YP supported housing,n,People


In [15]:
new_table['Clients in treatment'] = 'All young clients'


In [16]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,13500.0,All,Accommodation status/Living with parents or ot...,n,People,All young clients
1,0.83,All,Accommodation status/Living with parents or ot...,%,People,All young clients
2,1292.0,All,Accommodation status/YP living in care,n,People,All young clients
3,0.08,All,Accommodation status/YP living in care,%,People,All young clients
4,654.0,All,Accommodation status/YP supported housing,n,People,All young clients


In [17]:
new_table['Measure Type'] = new_table['Measure Type'].map(
    lambda x: {
        'n' : 'Count', 
        '%' : 'Percentage',
        }.get(x, x))


In [18]:
new_table.tail()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
12,44.0,All,Accommodation status/YP living in secure care,Count,People,All young clients
14,16169.0,All,Accommodation status/All,Count,People,All young clients
15,1.0,All,Accommodation status/All,Percentage,People,All young clients
16,267.0,All,Accommodation status/Missing or inconsistent data,Count,People,All young clients
17,16436.0,All,Accommodation status/All,Count,People,All young clients
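
The tail above contains two count rows labelled 'Accommodation status/All' (16169 and 16436), because the sheet has two rows labelled 'Total' (A11 and A13) and both are mapped to 'All'. One possible way to keep them apart is to relabel by order of appearance, assuming the second 'Total' is the one that includes the 'Missing or inconsistent data' row. This is only a sketch in plain Python; `relabel_totals` is a hypothetical helper, and the longer label is borrowed from the `fillna` call in cell 12.

```python
# (cell reference, label) pairs copied from the output of cell 4.
labels = [
    ('A10', 'YP living in secure care'),
    ('A11', 'Total'),
    ('A12', 'Missing or inconsistent data'),
    ('A13', 'Total'),
]


def relabel_totals(pairs):
    """Map the first 'Total' to 'All' and any later 'Total' to a longer label."""
    seen_total = False
    result = []
    for ref, label in pairs:
        if label == 'Total':
            label = 'All including missing and inconsistent data' if seen_total else 'All'
            seen_total = True
        result.append((ref, label))
    return result


relabelled = relabel_totals(labels)
```

After relabelling, every row in `relabelled` carries a distinct label, so the two totals would no longer collide in the output.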


In [19]:
new_table.dtypes

Value                   float64
Substance                object
Basis of treatment       object
Measure Type             object
Unit                     object
Clients in treatment     object
dtype: object

In [20]:
new_table['Value'] = new_table['Value'].astype(str)


In [21]:
new_table.head(3)

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,13500.0,All,Accommodation status/Living with parents or ot...,Count,People,All young clients
1,0.83,All,Accommodation status/Living with parents or ot...,Percentage,People,All young clients
2,1292.0,All,Accommodation status/YP living in care,Count,People,All young clients
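
Converting with `astype(str)` keeps the float formatting, so counts are written as '13500.0'. If whole-number counts are preferred in the output (an assumption about the downstream consumer, not something the notebook states), a per-row formatter along these lines could be used. `format_value` is a hypothetical helper.

```python
def format_value(value: float, measure_type: str) -> str:
    """Render whole-number counts without a trailing '.0'; leave other values as given."""
    if measure_type == 'Count' and float(value).is_integer():
        return str(int(value))
    return str(value)


# The three rows shown in the output above.
formatted = [
    format_value(13500.0, 'Count'),
    format_value(0.83, 'Percentage'),
    format_value(1292.0, 'Count'),
]
```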


In [22]:
new_table['Period'] = '2016-17'
new_table = new_table[['Period','Basis of treatment','Substance','Clients in treatment','Measure Type','Value','Unit']]


In [23]:
if is_interactive():
    destinationFolder = Path('out')
    destinationFolder.mkdir(exist_ok=True, parents=True)
    new_table.to_csv(destinationFolder / 'table2.6.1.csv', index=False)

In [24]:
new_table.head()

Unnamed: 0,Period,Basis of treatment,Substance,Clients in treatment,Measure Type,Value,Unit
0,2016-17,Accommodation status/Living with parents or ot...,All,All young clients,Count,13500.0,People
1,2016-17,Accommodation status/Living with parents or ot...,All,All young clients,Percentage,0.83,People
2,2016-17,Accommodation status/YP living in care,All,All young clients,Count,1292.0,People
3,2016-17,Accommodation status/YP living in care,All,All young clients,Percentage,0.08,People
4,2016-17,Accommodation status/YP supported housing,All,All young clients,Count,654.0,People
