Table 2.4.1: Source of referral of all treatment episodes 2016-17

In [1]:
from gssutils import *

if is_interactive():
    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches.file_cache import FileCache
    from cachecontrol.heuristics import LastModified
    from pathlib import Path

    session = CacheControl(requests.Session(),
                           cache=FileCache('.cache'),
                           heuristic=LastModified())

    sourceFolder = Path('in')
    sourceFolder.mkdir(exist_ok=True)

    inputURL = 'https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/664944/'\
                    'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    inputFile = sourceFolder / 'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    response = session.get(inputURL)
    with open(inputFile, 'wb') as f:
        f.write(response.content)

In [2]:
tab = loadxlstabs(inputFile, sheetids='2.4.1 Referral Source')[0]

Loading in\Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls which has size 281600 bytes
Table names: ['2.4.1 Referral Source']


In [3]:
observations = tab.excel_ref('B4').expand(DOWN).expand(RIGHT).is_not_blank()
observations

{<B10 182.0>, <C21 0.04>, <B30 17703.0>, <C30 1.0>, <B15 305.0>, <B29 263.0>, <B9 3803.0>, <B16 2727.0>, <B24 168.0>, <C19 0.11>, <C5 0.04>, <C27 0.09>, <C4 0.22>, <C24 0.01>, <B23 238.0>, <C23 0.01>, <B5 769.0>, <C12 0.25>, <B25 112.0>, <B18 882.0>, <B7 7.0>, <C17 0.06>, <C26 0.0>, <C25 0.01>, <C28 0.02>, <C20 0.07>, <B8 5168.0>, <C22 0.02>, <B12 4358.0>, <B13 2095.0>, <B31 20.0>, <B17 1091.0>, <B28 294.0>, <C16 0.15>, <B32 17723.0>, <B6 420.0>, <B4 3972.0>, <B26 69.0>, <C9 0.21>, <C14 0.02>, <C11 0.02>, <C18 0.05>, <C29 0.01>, <B21 742.0>, <B19 1973.0>, <C6 0.02>, <C8 0.29>, <B22 324.0>, <B20 1267.0>, <C13 0.12>, <C7 0.0>, <C15 0.02>, <B11 373.0>, <B14 327.0>, <C10 0.01>, <B27 1653.0>}

In [4]:
referral = tab.excel_ref('A4').expand(DOWN).is_not_blank() - tab.excel_ref('A32')
referral

{<A28 'YP housing'>, <A21 'CAMHS'>, <A23 'A&E'>, <A16 'Social care total'>, <A17 'Self'>, <A10 'YP secure estate'>, <A25 'Hospital'>, <A24 'GP'>, <A4 'Mainstream education'>, <A9 'YOT'>, <A29 'Other'>, <A30 'Total (episodes)'>, <A18 'Relative, family, friend or concerned other'>, <A12 'Youth / criminal justice total'>, <A8 'Education total'>, <A20 'Substance misuse total'>, <A6 'Education service'>, <A13 'Children and family services'>, <A15 'Social services'>, <A7 'Other'>, <A14 'Looked after child services'>, <A26 'Other'>, <A11 'Other'>, <A27 'Health total'>, <A22 'School nurse'>, <A31 'Missing or inconsistent data'>, <A5 'Alternative education'>, <A19 'Self, family & friends total'>}
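
As a quick sanity check on the selection, the counts printed in the two outputs above can be compared against the subtotal rows. The sketch below uses values copied by hand from column B (it does not read the spreadsheet) and confirms that each group of rows sums to its "total" row, that the group totals sum to 'Total (episodes)', and that adding the missing or inconsistent episodes gives the final figure in B32. All variable names are illustrative only.

```python
# Counts copied by hand from column B of the outputs above (rows 4 to 32).
groups = {
    'Education total': ([3972, 769, 420, 7], 5168),
    'Youth / criminal justice total': ([3803, 182, 373], 4358),
    'Social care total': ([2095, 327, 305], 2727),
    'Self, family & friends total': ([1091, 882], 1973),
    'Health total': ([742, 324, 238, 168, 112, 69], 1653),
}
# Rows that feed the overall total without a breakdown of their own.
standalone = {'Substance misuse total': 1267, 'YP housing': 294, 'Other': 263}

total_episodes = 17703   # B30, 'Total (episodes)'
missing = 20             # B31, 'Missing or inconsistent data'
grand_total = 17723      # B32, the row whose label is excluded from `referral`

# Collect any group whose components do not add up to its subtotal.
mismatches = [name for name, (parts, subtotal) in groups.items()
              if sum(parts) != subtotal]

computed_total = (sum(subtotal for _, subtotal in groups.values())
                  + sum(standalone.values()))

print('Groups that do not add up:', mismatches)
print('Sum of group totals:', computed_total)
print('Including missing data:', computed_total + missing)
```

The percentages in column C are shares of 'Total (episodes)'; for example, 3972 / 17703 rounds to 0.22, matching cell C4.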

In [5]:
measuretype = tab.excel_ref('B3').expand(RIGHT).is_not_blank() 
measuretype

{<C3 '%'>, <B3 'n'>}

In [6]:
Dimensions = [
            HDimConst('Substance','All'),
            HDim(referral,'Basis of treatment',DIRECTLY,LEFT),
            HDim(measuretype,'Measure Type',DIRECTLY,ABOVE),
            HDimConst('Unit','People')            
            ]
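
The two `HDim` entries describe how each observation finds its labels: `DIRECTLY, LEFT` takes the referral cell in the same row, and `DIRECTLY, ABOVE` takes the measure-type cell in the same column. The following is a minimal plain-Python sketch of that idea, not databaker's implementation. The dictionaries and the `lookup` helper are hypothetical and hold only a few cells copied from the outputs above.

```python
# A few cells copied from the outputs above, keyed by (column letter, row number).
observations = {('B', 4): 3972.0, ('C', 4): 0.22, ('B', 5): 769.0, ('C', 5): 0.04}
referral = {('A', 4): 'Mainstream education', ('A', 5): 'Alternative education'}
measuretype = {('B', 3): 'n', ('C', 3): '%'}


def lookup(cell, headers, direction):
    """Hypothetical helper: return the header label aligned with `cell`."""
    column, row = cell
    for (h_col, h_row), label in headers.items():
        # LEFT: same row, header in an earlier column.
        if direction == 'LEFT' and h_row == row and h_col < column:
            return label
        # ABOVE: same column, header in an earlier row.
        if direction == 'ABOVE' and h_col == column and h_row < row:
            return label
    return None  # no aligned header cell was found


# Walk the observations row by row, then column by column.
ordered = sorted(observations.items(), key=lambda item: (item[0][1], item[0][0]))
rows = [(value, lookup(cell, referral, 'LEFT'), lookup(cell, measuretype, 'ABOVE'))
        for cell, value in ordered]

for row in rows:
    print(row)
```

An observation with no header in its row, such as one in row 32 once A32 has been subtracted from `referral`, gets no label in this sketch.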

In [7]:
c1 = ConversionSegment(observations, Dimensions, processTIMEUNIT=True)
# if is_interactive():
#     savepreviewhtml(c1)

In [8]:
new_table = c1.topandas()
new_table




Unnamed: 0,OBS,Substance,Basis of treatment,Measure Type,Unit
0,3972.0,All,Mainstream education,n,People
1,0.22,All,Mainstream education,%,People
2,769.0,All,Alternative education,n,People
3,0.04,All,Alternative education,%,People
4,420.0,All,Education service,n,People
5,0.02,All,Education service,%,People
6,7.0,All,Other,n,People
7,0.0,All,Other,%,People
8,5168.0,All,Education total,n,People
9,0.29,All,Education total,%,People


In [9]:
new_table = new_table[new_table['OBS'] != 0 ]
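
Filtering with a boolean mask, as above, returns a DataFrame that pandas may treat as derived from the original, which is why the column assignments in the following cells emit `SettingWithCopyWarning`. One common way to avoid the warning is to take an explicit copy straight after filtering. A minimal sketch with a made-up four-row table (the `sample` and `filtered` names are illustrative, and the rows are only loosely based on the output above):

```python
import pandas as pd

# Small stand-in for new_table; not the full dataset.
sample = pd.DataFrame({
    'OBS': [3972.0, 0.22, 7.0, 0.0],
    'Basis of treatment': ['Mainstream education', 'Mainstream education',
                           'Other', 'Other'],
})

# .copy() makes the filtered result an independent DataFrame, so later
# column assignments are unambiguous and do not trigger the warning.
filtered = sample[sample['OBS'] != 0].copy()
filtered['Basis of treatment'] = 'Referral source/' + filtered['Basis of treatment']

print(filtered)
```

Note that the `!= 0` filter also drops percentages that were rounded to 0.0 in the source, such as the 'Other' row under education.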

In [10]:
new_table.columns = ['Value' if x=='OBS' else x for x in new_table.columns]

In [11]:
new_table['Basis of treatment'] = new_table['Basis of treatment'].map(
    lambda x: {
        'Total (episodes)' : 'All episodes' 
        }.get(x, x))

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
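
The `.get(x, x)` idiom used in the mapping cell above returns the replacement when the label is a key of the dictionary and the label itself otherwise, so every row other than 'Total (episodes)' passes through unchanged. The same behaviour in plain Python, using labels from the table (the `relabel` function name is illustrative):

```python
replacements = {'Total (episodes)': 'All episodes'}


def relabel(label):
    # dict.get(key, default): fall back to the original label when unmapped.
    return replacements.get(label, label)


labels = ['Health total', 'Total (episodes)', 'Missing or inconsistent data']
relabelled = [relabel(label) for label in labels]
print(relabelled)
```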
  


In [12]:
new_table['Basis of treatment'] = new_table['Basis of treatment'].fillna('All episodes including missing and inconsistent data')

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self._update_inplace(new_data)


In [13]:
new_table['Basis of treatment'] = 'Referral source/' + new_table['Basis of treatment']

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [14]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit
0,3972.0,All,Referral source/Mainstream education,n,People
1,0.22,All,Referral source/Mainstream education,%,People
2,769.0,All,Referral source/Alternative education,n,People
3,0.04,All,Referral source/Alternative education,%,People
4,420.0,All,Referral source/Education service,n,People


In [15]:
new_table['Clients in treatment'] = 'All young clients'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [16]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,3972.0,All,Referral source/Mainstream education,n,People,All young clients
1,0.22,All,Referral source/Mainstream education,%,People,All young clients
2,769.0,All,Referral source/Alternative education,n,People,All young clients
3,0.04,All,Referral source/Alternative education,%,People,All young clients
4,420.0,All,Referral source/Education service,n,People,All young clients


In [17]:
new_table['Measure Type'] = new_table['Measure Type'].map(
    lambda x: {
        'n' : 'Count', 
        '%' : 'Percentage',
        }.get(x, x))

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  


In [18]:
new_table.tail()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
51,0.01,All,Referral source/Other,Percentage,People,All young clients
52,17703.0,All,Referral source/All episodes,Count,People,All young clients
53,1.0,All,Referral source/All episodes,Percentage,People,All young clients
54,20.0,All,Referral source/Missing or inconsistent data,Count,People,All young clients
55,17723.0,All,Referral source/All episodes including missing...,Count,People,All young clients


In [19]:
new_table.dtypes

Value                   float64
Substance                object
Basis of treatment       object
Measure Type             object
Unit                     object
Clients in treatment     object
dtype: object

In [20]:
new_table['Value'] = new_table['Value'].astype(str)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [21]:
new_table.head(3)

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,3972.0,All,Referral source/Mainstream education,Count,People,All young clients
1,0.22,All,Referral source/Mainstream education,Percentage,People,All young clients
2,769.0,All,Referral source/Alternative education,Count,People,All young clients


In [22]:
new_table['Period'] = '2016-17'
new_table = new_table[['Period','Basis of treatment','Substance','Clients in treatment','Measure Type','Value','Unit']]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.


In [23]:
if is_interactive():
    destinationFolder = Path('out')
    destinationFolder.mkdir(exist_ok=True, parents=True)
    new_table.to_csv(destinationFolder / 'table2.4.1.csv', index=False)

In [24]:
new_table.head()

Unnamed: 0,Period,Basis of treatment,Substance,Clients in treatment,Measure Type,Value,Unit
0,2016-17,Referral source/Mainstream education,All,All young clients,Count,3972.0,People
1,2016-17,Referral source/Mainstream education,All,All young clients,Percentage,0.22,People
2,2016-17,Referral source/Alternative education,All,All young clients,Count,769.0,People
3,2016-17,Referral source/Alternative education,All,All young clients,Percentage,0.04,People
4,2016-17,Referral source/Education service,All,All young clients,Count,420.0,People
