Table 2.5.1: Education and employment status 2016-17

In [1]:
from gssutils import *

if is_interactive():
    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches.file_cache import FileCache
    from cachecontrol.heuristics import LastModified
    from pathlib import Path

    session = CacheControl(requests.Session(),
                           cache=FileCache('.cache'),
                           heuristic=LastModified())

    sourceFolder = Path('in')
    sourceFolder.mkdir(exist_ok=True)

    inputURL = 'https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/664944/'\
                    'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    inputFile = sourceFolder / 'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    response = session.get(inputURL)
    with open(inputFile, 'wb') as f:
        f.write(response.content)

In [2]:
tab = loadxlstabs(inputFile, sheetids='2.5.1 Education & Employment')[0]

Loading in\Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls which has size 281600 bytes
Table names: ['2.5.1 Education & Employment']


In [3]:
observations = tab.excel_ref('B4').expand(DOWN).expand(RIGHT).is_not_blank()
observations

{<B11 12.0>, <B9 228.0>, <C11 0.0>, <C7 0.04>, <B7 432.0>, <B12 11392.0>, <C12 1.0>, <B6 1788.0>, <B5 2199.0>, <B13 361.0>, <B14 11753.0>, <B4 6389.0>, <C9 0.02>, <B10 15.0>, <C8 0.03>, <C10 0.0>, <C4 0.56>, <C6 0.16>, <B8 329.0>, <C5 0.19>}

In [4]:
referral = tab.excel_ref('A4').expand(DOWN).is_not_blank() 
referral

{<A8 'Employed'>, <A10 'Economically inactive – health issue or caring role'>, <A14 'Total new presentations'>, <A9 'Persistent absentee or excluded'>, <A13 'Missing or inconsistent data'>, <A12 'Total'>, <A7 'Apprenticeship or training'>, <A6 'Not in employment or education or training (NEET)'>, <A11 'Voluntary work'>, <A5 'Alternative education'>, <A4 'Mainstream education'>}
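
As a quick sanity check on the two selections above, the counts in column B can be compared with the 'Total' and 'Total new presentations' rows, and the percentages in column C with each count's share of 'Total'. The sketch below hard-codes the values shown in the outputs; the variable names are illustrative and not part of the notebook.

```python
# Counts (column B) and percentages (column C) for rows 4-11,
# copied from the cell outputs above.
counts = {
    'Mainstream education': 6389.0,
    'Alternative education': 2199.0,
    'Not in employment or education or training (NEET)': 1788.0,
    'Apprenticeship or training': 432.0,
    'Employed': 329.0,
    'Persistent absentee or excluded': 228.0,
    'Economically inactive – health issue or caring role': 15.0,
    'Voluntary work': 12.0,
}
percentages = {
    'Mainstream education': 0.56,
    'Alternative education': 0.19,
    'Not in employment or education or training (NEET)': 0.16,
    'Apprenticeship or training': 0.04,
    'Employed': 0.03,
    'Persistent absentee or excluded': 0.02,
    'Economically inactive – health issue or caring role': 0.0,
    'Voluntary work': 0.0,
}
total = 11392.0       # B12, 'Total'
missing = 361.0       # B13, 'Missing or inconsistent data'
total_new = 11753.0   # B14, 'Total new presentations'

# The category counts should add up to 'Total'.
counts_add_up = sum(counts.values()) == total

# 'Total' plus the missing records should give 'Total new presentations'.
presentations_add_up = total + missing == total_new

# Each percentage should be the count's share of 'Total', to two places.
percentages_match = all(
    round(counts[label] / total, 2) == percentages[label] for label in counts
)

print(counts_add_up, presentations_add_up, percentages_match)
```

All three checks hold for the values shown, which also indicates that the percentages are shares of 'Total' (11392) rather than of 'Total new presentations' (11753).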

In [5]:
measuretype = tab.excel_ref('B3').expand(RIGHT).is_not_blank() 
measuretype

{<C3 '%'>, <B3 'n'>}

In [6]:
Dimensions = [
            HDimConst('Substance','All'),
            HDim(referral,'Basis of treatment',DIRECTLY,LEFT),
            HDim(measuretype,'Measure Type',DIRECTLY,ABOVE),
            HDimConst('Unit','People')            
            ]
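
To make the two `HDim` lookups above more concrete, here is a minimal pure-Python sketch of the idea: for each observation cell, `DIRECTLY, LEFT` picks the label in the same row to its left, and `DIRECTLY, ABOVE` picks the header in the same column above it. This only approximates the behaviour and is not databaker's implementation; the functions `directly_left` and `directly_above` and the dictionaries are hypothetical, and the cell values are a subset of those shown in the outputs above.

```python
COLUMNS = 'ABC'  # the sheet columns used in this table

# Header cells, keyed by (column, row).
labels = {('A', 4): 'Mainstream education', ('A', 5): 'Alternative education'}
measures = {('B', 3): 'n', ('C', 3): '%'}

# Observation cells, keyed the same way.
observations = {
    ('B', 4): 6389.0, ('C', 4): 0.56,
    ('B', 5): 2199.0, ('C', 5): 0.19,
}

def directly_left(cell, headers):
    """Walk left along the row until a header cell is found."""
    col, row = cell
    for c in reversed(COLUMNS[:COLUMNS.index(col)]):
        if (c, row) in headers:
            return headers[(c, row)]
    return None

def directly_above(cell, headers):
    """Walk up the column until a header cell is found."""
    col, row = cell
    for r in range(row - 1, 0, -1):
        if (col, r) in headers:
            return headers[(col, r)]
    return None

# Build one record per observation, ordered by row and then column.
rows = [
    {
        'OBS': value,
        'Basis of treatment': directly_left(cell, labels),
        'Measure Type': directly_above(cell, measures),
    }
    for cell, value in sorted(observations.items(),
                              key=lambda kv: (kv[0][1], kv[0][0]))
]

for record in rows:
    print(record)
```

The constant dimensions (`Substance` and `Unit`) need no lookup; they would simply be added to every record with the same value.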

In [7]:
c1 = ConversionSegment(observations, Dimensions, processTIMEUNIT=True)
# if is_interactive():
#     savepreviewhtml(c1)

In [8]:
new_table = c1.topandas()
new_table




Unnamed: 0,OBS,Substance,Basis of treatment,Measure Type,Unit
0,6389.0,All,Mainstream education,n,People
1,0.56,All,Mainstream education,%,People
2,2199.0,All,Alternative education,n,People
3,0.19,All,Alternative education,%,People
4,1788.0,All,Not in employment or education or training (NEET),n,People
5,0.16,All,Not in employment or education or training (NEET),%,People
6,432.0,All,Apprenticeship or training,n,People
7,0.04,All,Apprenticeship or training,%,People
8,329.0,All,Employed,n,People
9,0.03,All,Employed,%,People


In [9]:
# Drop observations whose value is exactly zero (here, the percentages that round to 0.0)
new_table = new_table[new_table['OBS'] != 0]

In [10]:
new_table.columns = ['Value' if x=='OBS' else x for x in new_table.columns]

In [11]:
new_table['Basis of treatment'] = new_table['Basis of treatment'].map(
    lambda x: {
        'Total' : 'All' 
        }.get(x, x))

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
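
The warning above is pandas' SettingWithCopyWarning. It typically appears when a column is assigned on a DataFrame that was itself produced by filtering another one, as `new_table` was in cell 9. One common way to avoid it, sketched here on a toy frame (the names `raw` and `filtered` are illustrative, not part of the notebook), is to take an explicit copy after filtering:

```python
import pandas as pd

# Toy stand-in for the ConversionSegment output, using a few values from the table.
raw = pd.DataFrame({
    'OBS': [11392.0, 1.0, 12.0, 0.0],
    'Basis of treatment': ['Total', 'Total', 'Voluntary work', 'Voluntary work'],
})

# Filter out zero observations, then copy so that later assignments
# clearly target a new, independent DataFrame.
filtered = raw[raw['OBS'] != 0].copy()

# Column assignment on the copy; 'Total' is relabelled, other values pass through.
filtered['Basis of treatment'] = filtered['Basis of treatment'].replace({'Total': 'All'})

print(filtered)
```

The original frame `raw` is left unchanged, which is usually what is intended after a filter like this.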
  


In [12]:
new_table['Basis of treatment'] = 'Education and employment status/' + new_table['Basis of treatment']



In [13]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit
0,6389.0,All,Education and employment status/Mainstream edu...,n,People
1,0.56,All,Education and employment status/Mainstream edu...,%,People
2,2199.0,All,Education and employment status/Alternative ed...,n,People
3,0.19,All,Education and employment status/Alternative ed...,%,People
4,1788.0,All,Education and employment status/Not in employm...,n,People


In [14]:
new_table['Clients in treatment'] = 'All young clients'



In [15]:
new_table.head()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,6389.0,All,Education and employment status/Mainstream edu...,n,People,All young clients
1,0.56,All,Education and employment status/Mainstream edu...,%,People,All young clients
2,2199.0,All,Education and employment status/Alternative ed...,n,People,All young clients
3,0.19,All,Education and employment status/Alternative ed...,%,People,All young clients
4,1788.0,All,Education and employment status/Not in employm...,n,People,All young clients


In [16]:
new_table['Measure Type'] = new_table['Measure Type'].map(
    lambda x: {
        'n' : 'Count', 
        '%' : 'Percentage',
        }.get(x, x))



In [17]:
new_table.tail()

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
14,12.0,All,Education and employment status/Voluntary work,Count,People,All young clients
16,11392.0,All,Education and employment status/All,Count,People,All young clients
17,1.0,All,Education and employment status/All,Percentage,People,All young clients
18,361.0,All,Education and employment status/Missing or inc...,Count,People,All young clients
19,11753.0,All,Education and employment status/Total new pres...,Count,People,All young clients
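
The relabelling in cells 11 and 16 relies on `dict.get(x, x)`: a value found in the mapping is replaced, and anything else passes through unchanged. The sketch below shows that behaviour in plain Python, and why the 'Total' to 'All' mapping has to run before the prefix is added in cell 12. The helper `relabel` is hypothetical and only mirrors the lambdas used above.

```python
PREFIX = 'Education and employment status/'

def relabel(value, mapping):
    """Return the mapped value, or the original value if it has no mapping."""
    return mapping.get(value, value)

measure_map = {'n': 'Count', '%': 'Percentage'}
basis_map = {'Total': 'All'}

# Known keys are replaced, unknown values pass through untouched.
print(relabel('n', measure_map))                 # mapped
print(relabel('Voluntary work', basis_map))      # unchanged

# Map first, then prefix (the order used in the notebook).
mapped_then_prefixed = PREFIX + relabel('Total', basis_map)

# Prefixing first means the key 'Total' no longer matches.
prefixed_then_mapped = relabel(PREFIX + 'Total', basis_map)

print(mapped_then_prefixed)
print(prefixed_then_mapped)
```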


In [18]:
new_table.dtypes

Value                   float64
Substance                object
Basis of treatment       object
Measure Type             object
Unit                     object
Clients in treatment     object
dtype: object

In [19]:
new_table['Value'] = new_table['Value'].astype(str)



In [20]:
new_table.head(3)

Unnamed: 0,Value,Substance,Basis of treatment,Measure Type,Unit,Clients in treatment
0,6389.0,All,Education and employment status/Mainstream edu...,Count,People,All young clients
1,0.56,All,Education and employment status/Mainstream edu...,Percentage,People,All young clients
2,2199.0,All,Education and employment status/Alternative ed...,Count,People,All young clients
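
As the output above shows, `astype(str)` keeps the float formatting, so counts are written as '6389.0'. The notebook leaves them that way. If whole-number counts were preferred in the output, one option would be to format by measure type, as in this sketch; `format_value` is a hypothetical helper, and whether downstream consumers expect integers is an assumption.

```python
def format_value(value, measure_type):
    """Render counts without a decimal part and leave other measures as-is."""
    if measure_type == 'Count':
        return str(int(value))
    return str(value)

# Values and measure types taken from the table above.
print(str(6389.0))                        # what astype(str) produces for a count
print(format_value(6389.0, 'Count'))      # count without the trailing '.0'
print(format_value(0.56, 'Percentage'))   # percentages are left unchanged
```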


In [21]:
new_table['Period'] = '2016-17'
new_table = new_table[['Period','Basis of treatment','Substance','Clients in treatment','Measure Type','Value','Unit']]



In [22]:
if is_interactive():
    destinationFolder = Path('out')
    destinationFolder.mkdir(exist_ok=True, parents=True)
    new_table.to_csv(destinationFolder / 'table2.5.1.csv', index=False)

In [23]:
new_table.head()

Unnamed: 0,Period,Basis of treatment,Substance,Clients in treatment,Measure Type,Value,Unit
0,2016-17,Education and employment status/Mainstream edu...,All,All young clients,Count,6389.0,People
1,2016-17,Education and employment status/Mainstream edu...,All,All young clients,Percentage,0.56,People
2,2016-17,Education and employment status/Alternative ed...,All,All young clients,Count,2199.0,People
3,2016-17,Education and employment status/Alternative ed...,All,All young clients,Percentage,0.19,People
4,2016-17,Education and employment status/Not in employm...,All,All young clients,Count,1788.0,People
