# Introduction
The purpose of this notebook is to identify anomalies in the item records so that lists can be exported for the acquisitions librarians. The script consists of the following parts:

1. Viewing the item records table

2. A series of queries on each column of the table
+ Barcode anomalies


# 1. Viewing the item records table

In [26]:
import pandas as pd

from kiblib.utils.db import DbConn

In [27]:
colonnes_a_exporter = ['barcode',
       'dateaccessioned', 'homebranch', 'price',
       'replacementprice', 'datelastborrowed',
       'datelastseen', 'notforloan', 'damaged', 'damaged_on',
       'itemlost', 'itemlost_on', 'withdrawn', 'withdrawn_on',
       'itemcallnumber','holdingbranch', 'timestamp', 'location',
       'onloan', 'ccode','itemtype']

In [28]:
db_conn = DbConn().create_engine()

In [29]:
query = """SELECT i.itemnumber, i.biblionumber, i.biblioitemnumber, i.barcode, i.dateaccessioned, i.booksellerid, i.homebranch, i.price, i.replacementprice, i.replacementpricedate, i.datelastborrowed, i.datelastseen, i.stack, i.notforloan, i.damaged, i.damaged_on, i.itemlost, i.itemlost_on, i.withdrawn, i.withdrawn_on, i.itemcallnumber, i.coded_location_qualifier, i.issues, i.renewals, i.reserves, i.restricted, i.itemnotes, i.itemnotes_nonpublic, i.holdingbranch,i.timestamp, i.location, i.permanent_location, i.onloan, i.cn_source, i.cn_sort, i.ccode, i.materials, i.uri, i.itype, i.more_subfields_xml, i.enumchron, i.copynumber, i.stocknumber, i.new_status, i.exclude_from_local_holds_priority, bi.itemtype
FROM koha_prod.items i
JOIN koha_prod.biblioitems bi ON bi.biblionumber = i.biblionumber """

In [30]:
items = pd.read_sql(query, db_conn)

items

In [31]:
items['homebranch'].value_counts(normalize=True)

MED    0.960901
MUS    0.025366
BUS    0.013733
Name: homebranch, dtype: float64

In [32]:
items[items['homebranch'].isna()]  # Equivalent here to filtering with a condition (SQL WHERE)

Unnamed: 0,itemnumber,biblionumber,biblioitemnumber,barcode,dateaccessioned,booksellerid,homebranch,price,replacementprice,replacementpricedate,...,materials,uri,itype,more_subfields_xml,enumchron,copynumber,stocknumber,new_status,exclude_from_local_holds_priority,itemtype
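
The filter used in the cell above can be illustrated in isolation. This is a minimal sketch on made-up data: the `demo` DataFrame and its values are assumptions for illustration and do not come from the Koha table.

```python
import pandas as pd

# Hypothetical toy data; not taken from the Koha table.
demo = pd.DataFrame({
    "itemnumber": [1, 2, 3],
    "homebranch": ["MED", None, "BUS"],
})

# Boolean mask: True where homebranch is missing
# (roughly SQL: WHERE homebranch IS NULL).
mask = demo["homebranch"].isna()

# Indexing the DataFrame with the mask keeps only the rows where it is True.
missing_branch = demo[mask]

print(missing_branch["itemnumber"].tolist())
```

An empty result, as in the output above, simply means that no row has a missing value in that column.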


In [41]:
barcode = items[items['barcode'].isna()]
barcode['notforloan'].value_counts()

-1    212
-2     35
 0     13
-4      3
-3      2
Name: notforloan, dtype: int64

To select rows by the values in a column (the equivalent of `IN` in SQL), there are **2 methods**:
* `.isin`: keeps the rows whose value is in the given list
* `~` in front of the `.isin` condition: keeps all the rows whose value is not in the given list
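
The two methods above can be sketched on a small made-up table. The `demo` DataFrame and its values are illustrative assumptions, not data from the item records.

```python
import pandas as pd

# Hypothetical toy data standing in for the item records table.
demo = pd.DataFrame({
    "itemnumber": [1, 2, 3, 4, 5],
    "notforloan": [0, -1, -2, -3, -4],
})

# Roughly SQL: WHERE notforloan IN (0, -4, -3)
selected = demo[demo["notforloan"].isin([0, -4, -3])]

# Roughly SQL: WHERE notforloan NOT IN (-1, -2)
excluded = demo[~demo["notforloan"].isin([-1, -2])]

print(selected["itemnumber"].tolist())
print(excluded["itemnumber"].tolist())
```

On this toy data both selections return the same rows. One difference from SQL is worth keeping in mind: `.isin` returns `False` for missing values, so `~ ... .isin(...)` keeps rows where the column is empty, whereas `NOT IN` in SQL would drop them.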

In [34]:
anomalies1 = barcode[barcode['notforloan'].isin([0,-4,-3])]

In [35]:
barcode[~barcode['notforloan'].isin([-1,-2])]

Unnamed: 0,itemnumber,biblionumber,biblioitemnumber,barcode,dateaccessioned,booksellerid,homebranch,price,replacementprice,replacementpricedate,...,materials,uri,itype,more_subfields_xml,enumchron,copynumber,stocknumber,new_status,exclude_from_local_holds_priority,itemtype
55543,432377,84901,84901,,2021-09-18,,MED,,,2021-09-18,...,,,PRETLIV,,,,,,,PA
88437,142053,125932,125932,,2005-03-25,,MED,40.0,40.0,,...,,,PRETLIV,,,,,,,LI
108384,394939,154693,154693,,2019-06-12,,MED,,,2019-06-12,...,,,PRETPER,,,,,,,PE
109467,398519,154781,154781,,2019-09-13,,MED,,,2019-09-13,...,,,PRETPER,,,,,,,PE
123056,437204,170136,170136,,2021-12-21,,MED,,,2021-12-21,...,,,PRETPER,,,,,,,PE
150394,398200,207159,207159,,2019-09-06,,MED,,,2019-09-06,...,,,PRETLIV,"<?xml version=""1.0"" encoding=""UTF-8""?>\n<colle...",,,,,,DV
154314,393851,212532,212532,,2019-05-23,,MED,,,2019-05-23,...,,,PRETLIV,,,,,,,LI
176347,382325,239036,239036,,2018-10-03,,MED,,,2018-10-03,...,,,PRETLIV,,,,,,,LI
185236,285808,249464,249464,,2012-09-15,,MED,12.0,12.0,,...,,,PRETLIV,,,,,,,LI
188624,444042,253811,253811,,2022-05-17,,MED,,,2022-05-17,...,,,PRETLIV,"<?xml version=""1.0"" encoding=""UTF-8""?>\n<colle...",,,,,,CA


In [36]:
anomalies1[colonnes_a_exporter].to_excel('liste_anomalies1.xlsx',index=False)

In [37]:
anomalies1.columns

Index(['itemnumber', 'biblionumber', 'biblioitemnumber', 'barcode',
       'dateaccessioned', 'booksellerid', 'homebranch', 'price',
       'replacementprice', 'replacementpricedate', 'datelastborrowed',
       'datelastseen', 'stack', 'notforloan', 'damaged', 'damaged_on',
       'itemlost', 'itemlost_on', 'withdrawn', 'withdrawn_on',
       'itemcallnumber', 'coded_location_qualifier', 'issues', 'renewals',
       'reserves', 'restricted', 'itemnotes', 'itemnotes_nonpublic',
       'holdingbranch', 'timestamp', 'location', 'permanent_location',
       'onloan', 'cn_source', 'cn_sort', 'ccode', 'materials', 'uri', 'itype',
       'more_subfields_xml', 'enumchron', 'copynumber', 'stocknumber',
       'new_status', 'exclude_from_local_holds_priority', 'itemtype'],
      dtype='object')

# Check the barcode structure next time
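
One possible starting point for that check is sketched below. The actual barcode format is not stated in this notebook, so the pattern (exactly 11 digits) and the `sample` values are purely hypothetical placeholders to be replaced with the real rule and with `items['barcode']`.

```python
import pandas as pd

# Hypothetical sample; in the notebook this would be items['barcode'].
sample = pd.Series(["12345678901", "1234", None, "ABC45678901"], name="barcode")

# ASSUMPTION: a valid barcode is exactly 11 digits. Replace with the real rule.
BARCODE_PATTERN = r"^\d{11}$"

# Missing barcodes are already covered by the isna() check, so drop them here.
present = sample.dropna()

# Keep the barcodes that do NOT match the expected structure.
malformed = present[~present.str.match(BARCODE_PATTERN)]

print(malformed.tolist())
```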