# L-Type Calcium Channel

However, for the L-type calcium channel (LTCC), relatively little data is available for CACNA1C, the primary isoform of interest. As this is an important target for cardiac safety modelling, it is worth examining the effects of relaxing the single-isoform requirement. ChEMBL also includes data for 'protein families', where the precise isoform responsible for the activity is unclear and/or the activity is believed to be due to all members of a family.

Thus, here we examine data labelled 'Voltage-gated L-type calcium channel' as well as data on all the various L-type calcium channel isoforms available in ChEMBL (CACNA1S and CACNA1D).

In addition, data for species other than Human and Rat have been included. Rabbit and Guinea Pig have been more important in cardiovascular studies than in pharmacology and toxicology in general, and a considerable amount of data has been generated for these species. Including them could therefore be vital for capturing, _e.g._, the full diversity of chemical structures tested in cardiovascular models. It is hoped that this increased coverage will compensate for the risk of increased 'noise' that could accompany a larger number of species.

In [1]:
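import re

import lxml.html
import pandas as pd  # used below as 'pd' (read_sql, read_pickle)
import requests
from IPython.display import HTML  # used below to render the table preview
from sqlalchemy import create_engine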

In [2]:
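def insert_last_after(df, col):
    """Return df with its last column moved to sit immediately after column 'col'."""

    cols = df.columns.values.tolist()

    # Index of the slot just after 'col'
    position = cols.index(col) + 1

    return df[cols[:position] + [cols[-1]] + cols[position:-1]]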
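The helper above is used later to move a freshly merged column next to a related one. A minimal sketch of its behaviour on a toy frame follows; the column names and values here are hypothetical and chosen only for illustration.

```python
import pandas as pd


def insert_last_after(df, col):
    """Return df with its last column moved to sit immediately after column 'col'."""
    cols = df.columns.values.tolist()
    position = cols.index(col) + 1
    return df[cols[:position] + [cols[-1]] + cols[position:-1]]


# Toy frame (hypothetical values): 'new' has just been appended as the last column.
toy = pd.DataFrame({'a': [1], 'b': [2], 'c': [3], 'new': [4]})

# Move 'new' so that it directly follows 'a'.
reordered = insert_last_after(toy, 'a')
print(reordered.columns.tolist())  # ['a', 'new', 'b', 'c']
```

Note that the helper assumes `col` is not itself the last column; in that case the last column would appear twice in the result.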

In [3]:
# ChEMBL connection...

database = 'francis/francis@chempro'

engine = create_engine('oracle://' + database.replace('/', ':'))

conn = engine.connect()

### SQL used to generate table

Note that the species assignment was done in SQL, but it could equally well have been done in Python, like the [tissue](#tissue) assignment: see the '[Species](Species.ipynb)' notebook for an example.
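As a rough illustration of the Python alternative, the sketch below mirrors only the branches of the SQL `case` expression that are visible in this notebook (Rabbit, Guinea Pig, Pig, Cat). The names `SPECIES_RULES` and `assign_species` are hypothetical, returning `None` when nothing matches is an assumption, and this is not necessarily how the 'Species' notebook does it.

```python
import re

# (label, Latin name, description keywords), in the same order as the SQL CASE.
# Order matters: 'guinea pig' must be tested before the more general 'pig'.
SPECIES_RULES = [
    ('Rabbit',     'Oryctolagus cuniculus', r'rabbit'),
    ('Guinea Pig', 'Cavia porcellus',       r'guinea pig'),
    ('Pig',        'Sus scrofa',            r'pig|porcine'),
    ('Cat',        'Felis catus',           r'cat|kitten'),
]


def assign_species(organism, assay_organism, description):
    """Return the first matching species label, or None if no rule matches (assumed default)."""
    description = description or ''
    for label, latin_name, words in SPECIES_RULES:
        # Same whole-word, case-insensitive pattern as the SQL regexp_like call.
        pattern = r'(^|\W)(' + words + r')(\W|$)'
        if (organism == latin_name
                or assay_organism == latin_name
                or re.search(pattern, description, flags=re.IGNORECASE)):
            return label
    return None


example = assign_species(
    'Homo sapiens',
    'Felis catus',
    'Inhibition of (-)-[3H]- D-888 binding to L-type calcium channels '
    'in kitten heart ventricle membranes',
)
print(example)  # Cat
```

With pandas, such a function could be applied row by row, for instance via `DataFrame.apply(..., axis=1)`, to populate a `species` column.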

In [4]:
print(open('TT_LTCC_CURVE_DATA.sql').read())

drop table TT_LTCC_CURVE_DATA;

create table TT_LTCC_CURVE_DATA as
select
    a.chembl_id as target_chemblid
  , a.pref_name
  , a.target_type
  , a.organism
  , case
      when (organism = 'Oryctolagus cuniculus' or assay_organism = 'Oryctolagus cuniculus' or regexp_like(description, '(^|\W)(' || 'rabbit'          || ')(\W|$)', 'i')) then 'Rabbit'
      when (organism = 'Cavia porcellus'       or assay_organism = 'Cavia porcellus'       or regexp_like(description, '(^|\W)(' || 'guinea pig'      || ')(\W|$)', 'i')) then 'Guinea Pig'
      when (organism = 'Sus scrofa'            or assay_organism = 'Sus scrofa'            or regexp_like(description, '(^|\W)(' || 'pig|porcine'     || ')(\W|$)', 'i')) then 'Pig'
      when (organism = 'Felis catus'           or assay_organism = 'Felis catus'           or regexp_like(description, '(^|\W)(' || 'cat|kitten'      || ')(\W|$)', 'i')) then 'Cat'
      when (organism = 'Bos taurus '           or assay_organism = 'Bos taurus'            or regex

In [5]:
# Read data from Oracle...

data = pd.read_sql("""
select
    *
from
    TT_LTCC_CURVE_DATA a
order by
    a.species
  , a.parent_cmpd_chemblid
  , a.active
""", engine)

data.shape

(1317, 28)

### Add compound classes 

See [Compound_Classes](Compound_classes.ipynb) notebook for details.

In [6]:
data = data.merge(pd.read_pickle('compound_class.pkl')[['compound_class']], left_on='parent_cmpd_chemblid', right_index=True)

data = insert_last_after(data, 'parent_cmpd_chemblid')
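Because `merge` defaults to `how='inner'`, any activity row whose `parent_cmpd_chemblid` is absent from the compound-class table is silently dropped. The sketch below, on hypothetical toy data, shows one way to list such compounds before merging, using a left merge with `indicator=True`; the frame names and the 'DHP' label are illustrative assumptions.

```python
import pandas as pd

# Hypothetical stand-ins for the activity table and the compound-class lookup.
activities = pd.DataFrame({
    'parent_cmpd_chemblid': ['C1', 'C2', 'C3'],
    'pchembl_value': [6.59, 6.41, 5.00],
})
classes = pd.DataFrame({'compound_class': ['PAA', 'DHP']}, index=['C1', 'C2'])

# A left merge keeps every activity row; '_merge' records whether a class was found.
checked = activities.merge(
    classes, how='left', left_on='parent_cmpd_chemblid', right_index=True, indicator=True
)

# Compounds that an inner merge would have dropped.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'parent_cmpd_chemblid'].tolist()
print(unmatched)  # ['C3']
```

An empty list would indicate that the inner merge loses no rows.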

<a name='tissue'></a>
### Add tissue assignment

The tissue is taken from the assay description where possible. See the [Tissues](Tissues.ipynb) notebook for details.

In [7]:
data = data.merge(pd.read_pickle('tissues.pkl'), how='left', left_on='description', right_index=True)

data = insert_last_after(data, 'species')

data['tissue'] = data['tissue'].fillna('')
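The tissue step relies on a left merge keyed on the assay description, followed by filling the gaps with empty strings. A minimal sketch on hypothetical toy data (the frames `assays` and `tissue_lookup` are illustrative stand-ins, not the real tables) shows the effect:

```python
import pandas as pd

# Hypothetical stand-ins for the assay table and the description-indexed tissue lookup.
assays = pd.DataFrame({
    'assay_chemblid': ['A1', 'A2'],
    'description': ['binding in kitten heart membranes', 'binding in an unspecified preparation'],
})
tissue_lookup = pd.DataFrame(
    {'tissue': ['Heart']},
    index=['binding in kitten heart membranes'],
)

# how='left' keeps every assay row, even when its description has no tissue entry.
merged = assays.merge(tissue_lookup, how='left', left_on='description', right_index=True)

# Assign the filled column back; in-place fillna on a column selection
# may not update the parent frame in newer pandas versions.
merged['tissue'] = merged['tissue'].fillna('')

print(merged['tissue'].tolist())  # ['Heart', '']
```

Rows without a tissue assignment therefore carry an empty string rather than NaN, which keeps later string comparisons simple.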

In [8]:
HTML(data.head(2).to_html())

Unnamed: 0,target_chemblid,pref_name,target_type,organism,species,tissue,relationship_type,assay_chemblid,description,assay_organism,parent_cmpd_chemblid,compound_class,standard_type,standard_relation,standard_value,standard_units,pchembl_value,activity_comment,data_validity_comment,potential_duplicate,cmpd_chemblid,compound_key,published_type,published_relation,published_value,published_units,doc_chemblid,pubmed_id,reference,active
0,CHEMBL1940,Voltage-gated L-type calcium channel alpha-1C subunit,SINGLE PROTEIN,Homo sapiens,Cat,Heart,H,CHEMBL656260,Inhibition of (-)-[3H]- D-888 binding to L-type calcium channels in kitten heart ventricle membranes,Felis catus,CHEMBL138302,PAA,IC50,=,260,nM,6.59,,,,CHEMBL138302,2d,IC50,=,0.26,uM,CHEMBL1127038,8474099,"J. Med. Chem., v. 36, p. 439 (1993)",1
1,CHEMBL1940,Voltage-gated L-type calcium channel alpha-1C subunit,SINGLE PROTEIN,Homo sapiens,Cat,Heart,H,CHEMBL656260,Inhibition of (-)-[3H]- D-888 binding to L-type calcium channels in kitten heart ventricle membranes,Felis catus,CHEMBL138302,PAA,IC50,=,390,nM,6.41,,,,CHEMBL138302,2a,IC50,=,0.39,uM,CHEMBL1127038,8474099,"J. Med. Chem., v. 36, p. 439 (1993)",1


In [9]:
# Save data table...

data.to_pickle('data.pkl')

In [10]:
# # Export to Excel...
# 
# NB: '=' values in the 'standard_relation' column are converted to 0 on Excel export, so export CSV instead for import into Excel.
# 
# data.to_excel('data.xlsx', index=False)

In [11]:
# Export CSV for Excel...

data.to_csv('data.csv', index=False)