Table 3.2.2: Interventions received by young people in treatment in 2016-17 (post November 2013 dataset change interventions)

In [1]:
from gssutils import *

if is_interactive():
    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches.file_cache import FileCache
    from cachecontrol.heuristics import LastModified
    from pathlib import Path

    session = CacheControl(requests.Session(),
                           cache=FileCache('.cache'),
                           heuristic=LastModified())

    sourceFolder = Path('in')
    sourceFolder.mkdir(exist_ok=True)

    inputURL = 'https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/664944/'\
                    'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
    inputFile = sourceFolder / 'Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls'
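    response = session.get(inputURL)
    # Fail early on an HTTP error instead of saving an error page as the spreadsheet
    response.raise_for_status()
    with open(inputFile, 'wb') as f:
        f.write(response.content)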

In [2]:
tab = loadxlstabs(inputFile, sheetids='3.2.2 Interventions')[0]

Loading in\Young-people-statistics-data-tables-from-the-national-drug-treatment-monitoring-system-2016-2017.xls which has size 281600 bytes
Table names: ['3.2.2 Interventions']


In [3]:
observations = tab.excel_ref('B5').expand(DOWN).expand(RIGHT).is_not_blank()
observations

{<B9 6.0>, <C6 335.0>, <D10 0.0>, <D8 6.0>, <C10 0.0>, <C7 10.0>, <F7 0.0>, <E7 16.0>, <B10 0.0>, <C11 9846.0>, <F9 0.0>, <E8 14.0>, <C12 0.608190746803385>, <E5 15710.0>, <D12 0.004879856692816109>, <B5 14541.0>, <C9 '*'>, <D5 67.0>, <D6 8.0>, <B12 0.9272963123108283>, <E10 0.0>, <F11 1.0>, <D7 0.0>, <C5 9533.0>, <F10 '-'>, <B8 12.0>, <B6 542.0>, <E9 8.0>, <B11 15012.0>, <E6 594.0>, <B7 14.0>, <E11 16189.0>, <F8 0.0>, <D9 0.0>, <F5 0.97>, <D11 79.0>, <C8 '*'>, <F6 0.04>}
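The selected cells mix numeric values with the suppression markers '*' and '-'. The sketch below illustrates, in plain pandas, one way such markers could be separated from the numbers. It is not a description of what databaker does internally; the sample values and the names `raw`, `values`, `markers`, and `result` are made up for the example.

```python
import pandas as pd

# Hypothetical sample mirroring the mix of numbers and markers seen above
raw = pd.Series([6.0, 335.0, '*', '-', 0.0], name='OBS')

# Coerce to numbers; anything that is not numeric becomes NaN
values = pd.to_numeric(raw, errors='coerce')

# Keep the original marker text only where the coercion failed
markers = raw.where(values.isna(), '')

result = pd.DataFrame({'Value': values, 'DATAMARKER': markers})
print(result)
```

Keeping the markers in their own column means the value column can stay numeric while the reason for a missing value is still recorded.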

In [4]:
setting = tab.excel_ref('A5').expand(DOWN).is_not_blank() 
setting

{<A9 'YP Inpatient unit'>, <A14 '‡ This is the total number of individuals receiving each type of intervention and not a summation of the columns.'>, <A10 'No setting recorded'>, <A18 '* All numbers under five have been suppressed. Where totals could be derived, figures have been rounded to the nearest five and marked with an asterisk.'>, <A11 'Total individuals‡'>, <A6 'Home'>, <A8 'Adult setting'>, <A5 'Community'>, <A7 'YP Residential unit'>, <A12 '% of total individuals with this intervention'>, <A16 'Δ This is the total number of individuals receiving at least one intervention type in each setting and not a summation of the rows.\n'>}

In [5]:
it = tab.excel_ref('B4').expand(RIGHT).is_not_blank() 
it

{<C4 'Harm reduction (n)'>, <D4 'Pharmacological (n)'>, <B4 'Psychosocial (n)'>}

In [6]:
wt = tab.excel_ref('E3').expand(RIGHT).is_not_blank() 
wt

{<F3 'Percentage of total individuals with this setting'>, <E3 'Total individuals with this settingΔ'>}

In [7]:
Dimensions = [
            HDim(it,'Clients1',DIRECTLY,ABOVE),
            HDim(wt,'Clients2',DIRECTLY,ABOVE),
            HDim(setting,'Basis of treatment',DIRECTLY, LEFT),
            HDimConst('Unit','People')            
            ]

In [8]:
c1 = ConversionSegment(observations, Dimensions, processTIMEUNIT=True)
# if is_interactive():
#     savepreviewhtml(c1)

In [9]:
new_table = c1.topandas()
new_table




Unnamed: 0,OBS,DATAMARKER,Clients1,Clients2,Basis of treatment,Unit
0,14541.0,,Psychosocial (n),,Community,People
1,9533.0,,Harm reduction (n),,Community,People
2,67.0,,Pharmacological (n),,Community,People
3,15710.0,,,Total individuals with this settingΔ,Community,People
4,0.97,,,Percentage of total individuals with this setting,Community,People
5,542.0,,Psychosocial (n),,Home,People
6,335.0,,Harm reduction (n),,Home,People
7,8.0,,Pharmacological (n),,Home,People
8,594.0,,,Total individuals with this settingΔ,Home,People
9,0.04,,,Percentage of total individuals with this setting,Home,People


In [10]:
new_table = new_table[new_table['OBS'] != 0 ]

In [None]:
new_table = new_table[new_table['OBS'] != '' ]

In [11]:
new_table.columns = ['Value' if x=='OBS' else x for x in new_table.columns]

In [12]:
new_table.head()

Unnamed: 0,Value,DATAMARKER,Clients1,Clients2,Basis of treatment,Unit
0,14541.0,,Psychosocial (n),,Community,People
1,9533.0,,Harm reduction (n),,Community,People
2,67.0,,Pharmacological (n),,Community,People
3,15710.0,,,Total individuals with this settingΔ,Community,People
4,0.97,,,Percentage of total individuals with this setting,Community,People


In [13]:
new_table['Measure Type'] = 'Count'

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  """Entry point for launching an IPython kernel.

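The warning above is raised because `new_table` was produced by boolean filtering, so pandas cannot tell whether the later column assignment is meant to affect the original frame. A minimal sketch of one common remedy, taking an explicit copy after filtering, is shown below. The frame `df` and its contents are hypothetical stand-ins for the notebook's table.

```python
import pandas as pd

# Hypothetical stand-in for the table produced by topandas()
df = pd.DataFrame({
    'OBS': [14541.0, 0.0, 67.0],
    'Clients1': ['Psychosocial (n)', 'Harm reduction (n)', 'Pharmacological (n)'],
})

# .copy() makes the filtered result an independent DataFrame,
# so adding a column to it is unambiguous
filtered = df[df['OBS'] != 0].copy()
filtered['Measure Type'] = 'Count'

print(filtered)
```

The original frame `df` is left unchanged; only `filtered` gains the new column.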

In [14]:
new_table.tail()

Unnamed: 0,Value,DATAMARKER,Clients1,Clients2,Basis of treatment,Unit,Measure Type
33,16189.0,,,Total individuals with this settingΔ,Total individuals‡,People,Count
34,1.0,,,Percentage of total individuals with this setting,Total individuals‡,People,Count
35,0.927296,,Psychosocial (n),,% of total individuals with this intervention,People,Count
36,0.608191,,Harm reduction (n),,% of total individuals with this intervention,People,Count
37,0.00487986,,Pharmacological (n),,% of total individuals with this intervention,People,Count
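The rows above are all labelled 'Count', although the last four hold proportions rather than counts. If the two kinds of observation need to be told apart, one possible approach is to derive the measure type from the header labels, as sketched below. The label 'Percentage' and the sample rows are assumptions made for illustration, not something the notebook itself defines.

```python
import pandas as pd

# Hypothetical rows using header labels taken from the table above
rows = pd.DataFrame({
    'Clients2': ['Total individuals with this settingΔ',
                 'Percentage of total individuals with this setting',
                 None],
    'Basis of treatment': ['Community',
                           'Community',
                           '% of total individuals with this intervention'],
})

# A row is treated as a proportion if either header says so
is_percentage = (
    rows['Clients2'].fillna('').str.startswith('Percentage')
    | rows['Basis of treatment'].fillna('').str.startswith('%')
)

# 'Percentage' is an assumed label, chosen only for this sketch
rows['Measure Type'] = is_percentage.map({True: 'Percentage', False: 'Count'})
print(rows[['Basis of treatment', 'Measure Type']])
```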


In [15]:
new_table.dtypes

Value                 object
DATAMARKER            object
Clients1              object
Clients2              object
Basis of treatment    object
Unit                  object
Measure Type          object
dtype: object

In [16]:
new_table['Value'] = new_table['Value'].astype(str)


In [17]:
new_table.head(3)

Unnamed: 0,Value,DATAMARKER,Clients1,Clients2,Basis of treatment,Unit,Measure Type
0,14541.0,,Psychosocial (n),,Community,People,Count
1,9533.0,,Harm reduction (n),,Community,People,Count
2,67.0,,Pharmacological (n),,Community,People,Count


In [18]:
new_table['Clients in treatment'] = new_table['Clients1'].fillna('') + new_table['Clients2'].fillna('')


In [19]:
new_table.head()

Unnamed: 0,Value,DATAMARKER,Clients1,Clients2,Basis of treatment,Unit,Measure Type,Clients in treatment
0,14541.0,,Psychosocial (n),,Community,People,Count,Psychosocial (n)
1,9533.0,,Harm reduction (n),,Community,People,Count,Harm reduction (n)
2,67.0,,Pharmacological (n),,Community,People,Count,Pharmacological (n)
3,15710.0,,,Total individuals with this settingΔ,Community,People,Count,Total individuals with this settingΔ
4,0.97,,,Percentage of total individuals with this setting,Community,People,Count,Percentage of total individuals with this setting
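Each row has a label in exactly one of `Clients1` and `Clients2`, which is why concatenating the two columns yields a single complete column. The sketch below reproduces that step on a small hypothetical frame and compares it with `Series.combine_first`, which gives the same result under this one-label-per-row assumption.

```python
import pandas as pd

# Hypothetical frame: each row has a label in exactly one of the two columns
t = pd.DataFrame({
    'Clients1': ['Psychosocial (n)', None],
    'Clients2': [None, 'Total individuals with this settingΔ'],
})

# Approach used in the notebook: treat missing entries as empty strings, then concatenate
t['Clients in treatment'] = t['Clients1'].fillna('') + t['Clients2'].fillna('')

# Alternative: take Clients1 where present, otherwise fall back to Clients2
alternative = t['Clients1'].combine_first(t['Clients2'])

print(t['Clients in treatment'].tolist())
print(alternative.tolist())
```

The two approaches differ only when both columns are populated on the same row: concatenation joins the two labels, while `combine_first` keeps the first.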


In [20]:
new_table['Basis of treatment'] =  'Setting/' + new_table['Basis of treatment'] 


In [21]:
new_table['Basis of treatment'] = new_table['Basis of treatment'].str.rstrip('‡')


In [22]:
new_table['Clients in treatment'] = new_table['Clients in treatment'].str.rstrip('Δ')


In [23]:
new_table['Period'] = '2016-17'
new_table['Substance'] = 'All'
new_table = new_table[['Period','Basis of treatment','Substance','Clients in treatment','Measure Type','Value','Unit']]


In [24]:
if is_interactive():
    destinationFolder = Path('out')
    destinationFolder.mkdir(exist_ok=True, parents=True)
    new_table.to_csv(destinationFolder / 'table3.2.2.csv', index=False)

In [25]:
new_table.tail()

Unnamed: 0,Period,Basis of treatment,Substance,Clients in treatment,Measure Type,Value,Unit
33,2016-17,Setting/Total individuals,All,Total individuals with this setting,Count,16189.0,People
34,2016-17,Setting/Total individuals,All,Percentage of total individuals with this setting,Count,1.0,People
35,2016-17,Setting/% of total individuals with this inter...,All,Psychosocial (n),Count,0.9272963123108284,People
36,2016-17,Setting/% of total individuals with this inter...,All,Harm reduction (n),Count,0.608190746803385,People
37,2016-17,Setting/% of total individuals with this inter...,All,Pharmacological (n),Count,0.0048798566928161,People
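The final labels no longer carry the footnote markers '‡' and 'Δ' because of the earlier `str.rstrip` calls. As a brief sketch of how that method behaves: its argument is a set of characters to strip from the right-hand end, not a literal suffix, so both markers can be handled in one call. The sample labels below are taken from the tables above; the variable names are illustrative.

```python
import pandas as pd

labels = pd.Series([
    'Total individuals‡',
    'Total individuals with this settingΔ',
    'Community',
])

# rstrip removes any trailing run of the listed characters;
# labels without a marker pass through unchanged
cleaned = labels.str.rstrip('‡Δ')
print(cleaned.tolist())
```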
